RE: Can I use boosting fields with edismax ?

2013-11-25 Thread Doug Turnbull
Amit, it's important to note that dismax/edismax isn't giving you a
weighted average of these field scores. Without the tie parameter, one
field's score is likely always winning the dismax contest. Field
scores are relative, so 5 could be an amazing score for, say, title
while 500 could be a terrible score for text. Dismax picks the field that
yields the maximum score, so the worst text scores might be sorted
higher than the best title matches.

Look at your debug output and use that, rather than your sense of
relative field importance, to adjust qf.
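
For illustration, a request that blends the losing fields back in might look
like this (the field names and the 0.1 value are only placeholders):

q=ipod&defType=edismax&qf=title^2.0 text&tie=0.1&debugQuery=true

With tie=0.0 (the default) only the best-scoring field counts; with tie=1.0 the
field scores are effectively summed; values in between blend the two behaviours.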

I wrote a blog post on this topic that you might find helpful

http://www.opensourceconnections.com/2013/07/02/getting-dissed-by-dismax-why-your-incorrect-assumptions-about-dismax-are-hurting-search-relevancy/

Sent from my Windows Phone

From: Amit Aggarwal
Sent: 11/25/2013 6:31 AM
To: solr-user@lucene.apache.org
Subject: Re: Can I use boosting fields with edismax ?
Ok Erick.. I will try thanks
On 25-Nov-2013 2:46 AM, Erick Erickson erickerick...@gmail.com wrote:

 This should work. Try adding debug=all to your URL, and examine
 the output both with and without your boosting. I believe you'll see
 the difference in the score calculations. From there it's a matter
 of adjusting the boosts to get the results you want.


 Best,
 Erick


 On Sat, Nov 23, 2013 at 9:17 AM, Amit Aggarwal amit.aggarwa...@gmail.com
 wrote:

  Hello All ,
 
  I am using defType=edismax
  So will boosting work like this in solrconfig.xml?
 
  <str name="qf">value_search^2.0 desc_search country_search^1.5
  state_search^2.0 city_search^2.5 area_search^3.0</str>
 
  I think it is not working ..
 
  If yes, then what should I do?
 



Solution for MM ignored in edismax queries with operators ?

2013-11-25 Thread Anca Kopetz

Hi,

We found a possible solution for SOLR-2649 (https://issues.apache.org/jira/browse/SOLR-2649): MM ignored in edismax
queries with operators. The details are here:
https://issues.apache.org/jira/browse/SOLR-2649?focusedCommentId=13822482&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13822482

Any feedback is welcome.

Best regards,
Anca Kopetz





In a function query, I can't get the ValueSource when extending ValueSourceParser

2013-11-25 Thread sling
hi,
I am working with solr4.1.
When I don't call parseValueSource(), my function query works well. The code is
like this:
public class DateSourceParser extends ValueSourceParser {
    @Override
    public void init(NamedList namedList) {
    }

    @Override
    public ValueSource parse(FunctionQParser fp) throws SyntaxError {
        return new DateFunction();
    }
}

When I want to use the ValueSource, like this:
public class DateSourceParser extends ValueSourceParser {
    @Override
    public void init(NamedList namedList) {
    }

    @Override
    public ValueSource parse(FunctionQParser fp) throws SyntaxError {
        ValueSource source = fp.parseValueSource();
        return new DateFunction(source);
    }
}

fp.parseValueSource() throws an error like this:
ERROR [org.apache.solr.core.SolrCore] - org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError:
Expected identifier at pos 12 str='dateDeboost()'
        at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:147)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:187)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
        at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:70)
        at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:173)
        at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:229)
        at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:274)
        at com.caucho.server.port.TcpConnection.run(TcpConnection.java:514)
        at com.caucho.util.ThreadPool.runTasks(ThreadPool.java:527)
        at com.caucho.util.ThreadPool.run(ThreadPool.java:449)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.search.SyntaxError: Expected identifier at pos 12 str='dateDeboost()'
        at org.apache.solr.search.QueryParsing$StrParser.getId(QueryParsing.java:747)
        at org.apache.solr.search.QueryParsing$StrParser.getId(QueryParsing.java:726)
        at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:345)
        at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:223)
        at org.sling.solr.custom.DateSourceParser.parse(DateSourceParser.java:24)
        at org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:352)
        at org.apache.solr.search.FunctionQParser.parse(FunctionQParser.java:68)
        at org.apache.solr.search.QParser.getQuery(QParser.java:142)
        at org.apache.solr.search.BoostQParserPlugin$1.parse(BoostQParserPlugin.java:61)
        at org.apache.solr.search.QParser.getQuery(QParser.java:142)
        at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:117)
        ... 13 more


so, how to make fp.parseValueSource() work?

Thanks!!!

sling







Re: distributed search is significantly slower than direct search

2013-11-25 Thread Manuel Le Normand
https://issues.apache.org/jira/browse/SOLR-5478

There it goes

On Mon, Nov 18, 2013 at 5:44 PM, Manuel Le Normand 
manuel.lenorm...@gmail.com wrote:

 Sure, I am out of office till the end of the week. I'll reply after I upload the patch.



RE: How To Use Multivalued Field Payload at Boosting?

2013-11-25 Thread Markus Jelsma
Solr has no query parsers that support payloads. You would have to make your own
query parser and also create a custom similarity implementing scorePayload for
it to work.
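
A minimal sketch of the similarity half (assuming Lucene/Solr 4.x and payloads
that were indexed as floats, e.g. via DelimitedPayloadTokenFilterFactory with a
float encoder):

import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.search.similarities.DefaultSimilarity;
import org.apache.lucene.util.BytesRef;

public class PayloadSimilarity extends DefaultSimilarity {
    @Override
    public float scorePayload(int doc, int start, int end, BytesRef payload) {
        if (payload == null) {
            return 1.0f; // no payload on this position, leave the score alone
        }
        // decode the 4-byte float payload and use it as a score factor
        return PayloadHelper.decodeFloat(payload.bytes, payload.offset);
    }
}

The query-parser half would then have to produce a payload-aware query (for
example Lucene's PayloadTermQuery) so that scorePayload is actually invoked.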
 
-Original message-
 From:Furkan KAMACI furkankam...@gmail.com
 Sent: Sunday 24th November 2013 19:07
 To: solr-user@lucene.apache.org
 Subject: How To Use Multivalued Field Payload at Boosting?
 
 I have a multivalued field and the values have payloads. How can I use those
 payloads for boosting? (When a user searches for a keyword and a match
 happens on that multivalued field, its payload should be added to the
 general score.)
 
 PS: I use Solr 4.5.1 as Cloud.
 


FYI real-time get handler is needed for Solr cloud recovery.

2013-11-25 Thread Daniel Collins
Just had an issue on our Solr cloud and wanted to point this out to the
list at large.

The real-time /get handler is used by Solr Cloud's sync/recovery
mechanism, so *DO NOT* remove it from SolrConfig if you are using Solr
Cloud!
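
For reference, the stock definition in the example solrconfig looks roughly
like this; leave it (or something equivalent) in place:

<requestHandler name="/get" class="solr.RealTimeGetHandler">
  <lst name="defaults">
    <str name="omitHeader">true</str>
  </lst>
</requestHandler>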

We did (because we weren't using real-time get ourselves and we were trying
to remove all the unnecessary stuff from solrconfig).  What it means is
that whenever a leadership change for a shard happens, ALL the replicas go
into full recovery mode, since they can't determine whether they are in
sync or not.

There seem to be some getVersions messages which are implemented in the
RealTimeGetComponent, and since these are required for cloud recovery,
shouldn't there be more emphasis on this being a required component (or
part of the Core Admin Handler so it can't be configured away)?

We have comments in the schema that the _version_ field is mandatory for
SolrCloud; I think we at least need something similar for the /get
handler.

I'll log a JIRA for this, but sending here first.


synchronization between replicas

2013-11-25 Thread adfel70
Hi,

We are currently running tests on Solr to find as many problems in our Solr
environment as we can, so that we are ready for these kinds of problems in
production. Anyway, we found an edge case and have a few questions about it.

We have one collection with two shards, each shard with replication factor 2.
We are sending docs to the index and everything is okay. Now the scenario:
1. take one of the replicas of shard1 down (it doesn't matter which one)
2. continue indexing documents (that's important for this scenario)
3. take down the second replica of shard1 (now the shard is down and we
cannot index anymore)
4. bring the replica from step 1 back up (it's important that this replica
comes up first)
5. bring the replica from step 3 back up

The regular synchronization flow is that the leader synchronizes the other
replica, but I'm pretty sure this is a known issue. Is there a way to do a
two-way synchronization, or do you have any other solution for me?

thanks





Setting solr.data.dir for SolrCloud instance

2013-11-25 Thread adfel70
I found something strange while trying to create more than one collection in
SolrCloud:
I am running every instance with -Dsolr.data.dir=/data
If I look at Core Admin section, I can see that I have one core and its
dataDir is set to this fixed location. Problem is, if I create a new
collection, another core is created - but with this fixed index location
again.
I was expecting that the path I sent would serve as the BASE path for all
cores that the node hosts. The current behaviour seems like a bug to me, because
obviously one collection will see data that was not indexed to it.
Is there a way to overcome this? I mean, change the default data dir
location, but still be able to create more than one collection correctly?





Re: Setting solr.data.dir for SolrCloud instance

2013-11-25 Thread Erick Erickson
The first thing I'd do is not send an absolute path. What
happens if you just send -Dsolr.data.dir=data (no leading '/')?

We had this discussion a while ago when we were working
on auto-discovery, and it turns out that
there _are_ legitimate cases in which more than one
core/collection can point to the same data dir. You have to very
carefully control who writes to the core, and I wouldn't do it
unless there was no choice, but some people find it useful.

And, in general, I wouldn't mix and match the _core_ admin API
with the _collections_ api unless you're very confident in what
you are doing.

Why isn't just letting the default data.dir location work for you?
There can be good reasons to make it explicit; I'm mostly just checking
that you're not over-thinking the problem. Usually the data dirs will be
located in a reasonable place.
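
If you do want a single shared property, one workaround that sometimes comes up
(a sketch, assuming the implicit solr.core.name substitution is available in
your solrconfig.xml) is to key the dataDir off the core name so each core gets
its own subdirectory:

<!-- in solrconfig.xml -->
<dataDir>${solr.data.dir:}/${solr.core.name}</dataDir>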

Best,
Erick



On Mon, Nov 25, 2013 at 8:12 AM, adfel70 adfe...@gmail.com wrote:

 I found something strange while trying to create more than one collection
 in
 SolrCloud:
 I am running every instance with -Dsolr.data.dir=/data
 If I look at Core Admin section, I can see that I have one core and its
 dataDir is set to this fixed location. Problem is, if I create a new
 collection, another core is created - but with this fixed index location
 again.
 I was expecting that the path I sent would serve as the BASE path for all
 cores the the node hosts. Current behaviour seems like a bug to me, because
 obviously one collection will see data that was not indexed to him.
 Is there a way to overcome this? I mean, change the default data dir
 location, but still be able to create more than one collection correctly?






Re: Parse eDisMax queries for keywords

2013-11-25 Thread Mirko
Hi Jack,
thanks for your reply. OK, in this case I agree that enriching the query
in the application layer is a good idea. We are still a bit puzzled about what
the enriched query should look like. I'll post here when we find a solution.
If somebody has suggestions, I'd be happy to hear them.
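
For illustration, one shape such an enriched request could take (just a sketch,
assuming the application has already recognized "season 1" in the raw query) is:

q=Footitle&defType=edismax&qf=title&bq=season:1^5

i.e. the keyword/number pair is stripped from q and re-added as a boost query
(or as fq=season:1 if it should filter rather than boost).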

Mirko


2013/11/21 Jack Krupansky j...@basetechnology.com

 The query parser does its own tokenization and parsing before your
 analyzer tokenizer and filters are called, assuring that only one white
 space-delimited token is analyzed at a time.

 You're probably best off having an application layer preprocessor for the
 query that enriches the query in the manner that you're describing.

 Or, simply settle for a heuristic approach that may give you 70% of what
 you want using only existing Solr features on the server side.

 -- Jack Krupansky

 -Original Message- From: Mirko
 Sent: Thursday, November 21, 2013 5:30 AM
 To: solr-user@lucene.apache.org
 Subject: Parse eDisMax queries for keywords


 Hi,
 We would like to implement special handling for queries that contain
 certain keywords. Our particular use case:

  In the example query "Footitle season 1" we want to discover the keyword
  "season", get the subsequent number, and boost (or filter for) documents
  that match "1" on the field named "season".

  We have two fields in our schema:

  <!-- titles contains titles -->
  <field name="title" type="text" indexed="true" stored="true"
         multiValued="false"/>

  <fieldType name="text" class="solr.TextField" omitNorms="true">
    <analyzer>
      <charFilter class="solr.MappingCharFilterFactory"
                  mapping="mapping-ISOLatin1Accent.txt"/>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- ... -->
    </analyzer>
  </fieldType>

  <field name="season" type="season_number" indexed="true" stored="false"
         multiValued="false"/>

  <!-- season contains season numbers -->
  <fieldType name="season_number" class="solr.TextField" omitNorms="true">
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.PatternReplaceFilterFactory" pattern=".*(?:season) *0*([0-9]+).*" replacement="$1"/>
    </analyzer>
  </fieldType>


 Our idea was to use a Keyword tokenizer and a Regex on the season field
 to extract the season number from the complete query.

  However, we use an ExtendedDisMax query parser in our search handler:

  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">edismax</str>
      <str name="qf">
        title season
      </str>
    </lst>
  </requestHandler>


  The problem is that eDisMax tokenizes the query, so that our field
  "season" receives the tokens [Foo, season, 1] without any order,
  instead of the complete query.

 How can we pass the complete query (untokenized) to the season field? We
 don't understand which tokenizer is used here and why our season field
 received tokens instead of the complete query.

 Or is there another approach to solve this use case with Solr?

 Thanks,
 Mirko



Re: Suggester - how to return exact match?

2013-11-25 Thread Mirko
Thanks! We solved this issue in the front-end now. I.e. we add the exact
match to the list of suggestions there.

Mirko


2013/11/22 Developer bbar...@gmail.com

 Might not be a perfect solution, but you can use an EdgeNGram filter, copy all
 your field data to that field, and use it for suggestions.

 <fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="250"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>

 http://localhost:8983/solr/core1/select?q=name:iphone

 The above query will return
 iphone
 iphone5c
 iphone4g






Solr 4.x : how to implement an update processor chain working for partial updates

2013-11-25 Thread paule_lecuyer
In my solr schema I have the following fields defined : 

  <field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>
  <field name="all" type="text_general" indexed="true" stored="false" multiValued="true" termVectors="true"/>
  <field name="eng" type="text_en" indexed="true" stored="false" multiValued="true" termVectors="true"/>
  <field name="ita" type="text_it" indexed="true" stored="false" multiValued="true" termVectors="true"/>
  <field name="fre" type="text_fr" indexed="true" stored="false" multiValued="true" termVectors="true"/>
  ...
  <copyField source="content" dest="all"/>

To fill in the language-specific fields, I use a custom update processor
chain, with a custom ConditionalCopyProcessor that copies the content field
into the appropriate language field, depending on the document language (as
explained in http://wiki.apache.org/solr/UpdateRequestProcessor).

The problem is that this custom chain is applied to the document passed in the
update request. It therefore works all right when inserting a new document or
updating the whole document, where all fields are provided, but it does not
when the passed document holds only the updated fields (as the
language-specific fields are not stored).

I would like to avoid setting the language-specific fields to stored=true, as
the content field may hold big values.

Is there a way to have Solr execute my ConditionalCopyProcessor on the
actual updated doc (the one resulting from Solr retrieving all stored values
and merging them with the update request values), and not on the request doc?

Thanks a lot for your help.

Paule





ConcurrentModificationException from XMLResponseWriter

2013-11-25 Thread Shyamsunder R Mutcha


The following exception was found in the Solr logs. We are using Solr 3.2. As
the stack trace does not refer to any of our application classes, I couldn't
figure out the piece of code that throws this exception. Is there any way to
debug this issue?

Is it related to the issue "ConcurrentModificationException from
BinaryResponseWriter"?

Nov 25, 2013 7:10:56 AM org.apache.solr.common.SolrException log
SEVERE: java.util.ConcurrentModificationException
        at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:373)
        at java.util.LinkedHashMap$EntryIterator.next(LinkedHashMap.java:392)
        at java.util.LinkedHashMap$EntryIterator.next(LinkedHashMap.java:391)
        at org.apache.solr.response.XMLWriter.writeMap(XMLWriter.java:644)
        at org.apache.solr.response.XMLWriter.writeVal(XMLWriter.java:591)
        at org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:131)
        at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:35)
        at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:343)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:265)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:541)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
        at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
        at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
        at java.lang.Thread.run(Thread.java:662)

Thanks

Trouble with manually routed collection after upgrade to 4.6

2013-11-25 Thread Brett Hoerner
Hi, I've been using a collection on Solr 4.5.X for a few weeks and just did
an upgrade to 4.6 and am having some issues.

First: this collection is, I guess, implicitly routed. I do this for every
document insert using SolrJ:

  document.addField("_route_", shardId)

After upgrading the servers to 4.6 I now get the following on every
insert/delete when using either SolrJ 4.5.1 or 4.6:

  org.apache.solr.common.SolrException: No active slice servicing hash code
17b9dff6 in DocCollection

In the clusterstate *none* of my shards have a range set (they're all
null), but I thought this would be expected since I do routing myself.

Did the upgrade change something here? I didn't see anything related to
this in the upgrade notes.

Thanks,
Brett


RE: Multiple data/index.YYYYMMDD.... dirs == bug?

2013-11-25 Thread Markus Jelsma


 
 
-Original message-
 From:Otis Gospodnetic otis.gospodne...@gmail.com
 Sent: Wednesday 20th November 2013 16:40
 To: solr-user@lucene.apache.org
 Subject: Multiple data/index.MMDD dirs == bug?
 
 Hi,
 
 When full index replication is happening via SnapPuller, a temporary
 timestamped index dir is created.
 
 Questions:
 1) Under normal circumstances could more than 1 timestamped index
 directory ever be present?

No, except during replication.
 2) Should there always be an the .../data/index directory present?

No, the directory can also be index.TIME. It is pointed to from 
index.properties.

 
 I'm asking because I see the following situation on one SolrCloud node:
 
 $ du -ms /home/solr/data/*
 1188367   /home/solr/data/index.20131118152402344
 709050    /home/solr/data/index.20131119210950598
 1         /home/solr/data/index.properties
 1         /home/solr/data/replication.properties
 3053      /home/solr/data/tlog
 
 Note:
 1) there are 2 timestamped directories
 2) there is no data/index directory

This is not good, but you can safely remove all index directories that are not
referenced in index.properties; usually you keep only the newest one.
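
For reference, index.properties is just a small properties file in the data dir
that names the live index directory, something like:

index=index.20131119210950598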

 
 According to SnapPuller, the timestamped index dir is a temporary dir
 and should be removed after replication. unless maybe some error
 case is not being handled correctly and timestamped index dirs are
 leaking.

It can happen when Solr dies; the leftover directories are not removed on startup.

 
 Thanks,
 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/
 


Re: ConcurrentModificationException from XMLResponseWriter

2013-11-25 Thread Shawn Heisey
On 11/25/2013 8:43 AM, Shyamsunder R Mutcha wrote:
 
 
 Following exception is found in solr logs. We are using Solr 3.2. As the 
 stack trace is not referring to any application classes, I couldn't figure 
 out the piece of code that throws this exception. Is there any way to debug 
 this issue?
 
 Is it related to the issue ConcurrentModificationException from 
 BinaryResponseWriter 
 
 Nov 25, 2013 7:10:56 AM org.apache.solr.common.SolrException log
 SEVERE: java.util.ConcurrentModificationException
 at 
 java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:373)
 at java.util.LinkedHashMap$EntryIterator.next(LinkedHashMap.java:392)
 at java.util.LinkedHashMap$EntryIterator.next(LinkedHashMap.java:391)
 at org.apache.solr.response.XMLWriter.writeMap(XMLWriter.java:644)

The exception is coming from LinkedHashMap, a built-in Java object type.

http://docs.oracle.com/javase/6/docs/api/java/util/LinkedHashMap.html

The code that made the call that's failing is line 644 of this source
code file:

solr/core/src/java/org/apache/solr/response/XMLWriter.java

I looked at the 3.2 source code.  What's going on here is fairly normal
- it's iterating through a Map and outputting the data contained there
to the writer.

The actual problem is occurring elsewhere; it's only showing up in
XMLWriter due to the way LinkedHashMap objects work.  Another thread has
modified the Map while the iterator is being used. This is something
you're not allowed to do with this object type, so it throws the exception.
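
A contrived, self-contained illustration of that failure mode (plain Java,
nothing Solr-specific):

import java.util.LinkedHashMap;
import java.util.Map;

public class CmeDemo {
    public static void main(String[] args) {
        final Map<String, Integer> facets = new LinkedHashMap<String, Integer>();
        for (int i = 0; i < 100000; i++) {
            facets.put("term" + i, i);
        }
        // simulates a component that keeps mutating the map after handing it off
        Thread mutator = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < 100000; i++) {
                    facets.put("late" + i, i);
                }
            }
        });
        mutator.start();
        // simulates the response writer iterating the same live map; iterating a
        // snapshot taken before mutator.start(), e.g.
        // new LinkedHashMap<String, Integer>(facets), would be safe instead
        try {
            long sum = 0;
            for (Map.Entry<String, Integer> e : facets.entrySet()) {
                sum += e.getValue();
            }
            System.out.println("finished without a conflict: " + sum);
        } catch (java.util.ConcurrentModificationException e) {
            System.out.println("iterator failed, as described above: " + e);
        }
    }
}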

I can't find any existing Solr bugs, so the question is: Are you using
any custom code with Solr?  Perhaps something you downloaded or
purchased, or something you wrote in-house?  If so, then that code has
some bugs.

If this *is* a bug in Solr 3.x, it is highly unlikely that it will get
fixed, at least in a 3.x version.  If it still exists in version 4.x
(which is unlikely), then it will get fixed there.  Version 3.2 is two
years old, and the entire 3.x branch is in maintenance mode, meaning
that only EXTREMELY severe bugs will be fixed.

Thanks,
Shawn



Re: Trouble with manually routed collection after upgrade to 4.6

2013-11-25 Thread Brett Hoerner
Here's my clusterstate.json:

  https://gist.github.com/bretthoerner/a8120a8d89c93f773d70


On Mon, Nov 25, 2013 at 10:18 AM, Brett Hoerner br...@bretthoerner.comwrote:

 Hi, I've been using a collection on Solr 4.5.X for a few weeks and just
 did an upgrade to 4.6 and am having some issues.

 First: this collection is, I guess, implicitly routed. I do this for every
 document insert using SolrJ:

   document.addField(_route_, shardId)

 After upgrading the servers to 4.6 I now get the following on every
 insert/delete when using either SolrJ 4.5.1 or 4.6:

   org.apache.solr.common.SolrException: No active slice servicing hash
 code 17b9dff6 in DocCollection

 In the clusterstate *none* of my shards have a range set (they're all
 null), but I thought this would be expected since I do routing myself.

 Did the upgrade change something here? I didn't see anything related to
 this in the upgrade notes.

 Thanks,
 Brett



Re: How To Use Multivalued Field Payload at Boosting?

2013-11-25 Thread Furkan KAMACI
Is there any example for it?


2013/11/25 Markus Jelsma markus.jel...@openindex.io

 Solr has no query parsers that support payloads. You would have to make your
 own query parser and also create a custom similarity implementing
 scorePayload for it to work.

 -Original message-
  From:Furkan KAMACI furkankam...@gmail.com
  Sent: Sunday 24th November 2013 19:07
  To: solr-user@lucene.apache.org
  Subject: How To Use Multivalued Field Payload at Boosting?
 
  I have a multivalued field and they have payloads. How can I use that
  payloads at boosting? (When user searches for a keyword and if a match
  happens at that multivalued field its payload will be added it to the
  general score)
 
  PS: I use Solr 4.5.1 as Cloud.
 



Re: Trouble with manually routed collection after upgrade to 4.6

2013-11-25 Thread Brett Hoerner
Think I got it. For some reason this was in my clusterstate.json after the
upgrade (note that I was using 4.5.X just fine previously...):

 "router": {
   "name": "compositeId"
 },

I stopped all my nodes and manually edited this to "implicit" (is there
a tool for this? I've always done it manually), started the cluster up
again, and it's all good now.
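
For reference, one way to pull and push that file without hand-editing it on a
node is Solr's zkcli script (a sketch; this assumes the script lives under
example/scripts/cloud-scripts in your install, that the getfile/putfile
commands exist in your version, and that ZooKeeper is at localhost:2181):

./zkcli.sh -zkhost localhost:2181 -cmd getfile /clusterstate.json /tmp/clusterstate.json
(edit /tmp/clusterstate.json, e.g. change "name":"compositeId" to "name":"implicit")
./zkcli.sh -zkhost localhost:2181 -cmd putfile /clusterstate.json /tmp/clusterstate.json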



On Mon, Nov 25, 2013 at 10:38 AM, Brett Hoerner br...@bretthoerner.comwrote:

 Here's my clusterstate.json:

   https://gist.github.com/bretthoerner/a8120a8d89c93f773d70


 On Mon, Nov 25, 2013 at 10:18 AM, Brett Hoerner br...@bretthoerner.comwrote:

 Hi, I've been using a collection on Solr 4.5.X for a few weeks and just
 did an upgrade to 4.6 and am having some issues.

 First: this collection is, I guess, implicitly routed. I do this for
 every document insert using SolrJ:

   document.addField(_route_, shardId)

 After upgrading the servers to 4.6 I now get the following on every
 insert/delete when using either SolrJ 4.5.1 or 4.6:

   org.apache.solr.common.SolrException: No active slice servicing hash
 code 17b9dff6 in DocCollection

 In the clusterstate *none* of my shards have a range set (they're all
 null), but I thought this would be expected since I do routing myself.

 Did the upgrade change something here? I didn't see anything related to
 this in the upgrade notes.

 Thanks,
 Brett





Re: Cloning shards = cloning collections

2013-11-25 Thread Otis Gospodnetic
Hi,

As a matter of fact, what about exposing a new Collection API CLONE command
and having Solr simply copy all the needed shards and replicas at the FS
level? Would that work (or not, because of different Directory
implementations that may not all lend themselves to being simply copied)?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Nov 25, 2013 at 12:10 AM, Otis Gospodnetic 
otis.gospodne...@gmail.com wrote:

 Hi,

 In http://search-lucene.com/m/O1O2r14sU811 Shalin wrote:

  The splitting process is nothing but the creation of a bitset with
  which a LiveDocsReader is created. These readers are then added to
  a new index via the IW.addIndexes(IndexReader[] readers) method.

  ... which makes me wonder: couldn't the same mechanism be used to clone
  shards and thus allow us to clone/duplicate a whole collection?  A handy
  feature, IMHO.

 Thanks,
 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
  Solr & Elasticsearch Support * http://sematext.com/





Re: ConcurrentModificationException from XMLResponseWriter

2013-11-25 Thread Shyamsunder R Mutcha
Shawn,

We have custom search handlers that use built-in components (result and facet)
to generate the results. I see that our facet generation is using a
LinkedHashMap. I will revisit my code. Thanks for the advice!

We are migrating to Solr4 soon :)

Thanks



On Monday, November 25, 2013 11:28 AM, Shawn Heisey s...@elyograg.org wrote:
 
On 11/25/2013 8:43 AM, Shyamsunder R Mutcha wrote:
 
 
 Following exception is found in solr logs. We are using Solr 3.2. As the 
 stack trace is not referring to any application classes, I couldn't figure 
 out the piece of code that throws this exception. Is there any way to debug 
 this issue?
 
 Is it related to the issue ConcurrentModificationException from 
 BinaryResponseWriter 
 
 Nov 25, 2013 7:10:56 AM org.apache.solr.common.SolrException log
 SEVERE: java.util.ConcurrentModificationException
         at 
java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:373)
         at java.util.LinkedHashMap$EntryIterator.next(LinkedHashMap.java:392)
         at java.util.LinkedHashMap$EntryIterator.next(LinkedHashMap.java:391)
         at org.apache.solr.response.XMLWriter.writeMap(XMLWriter.java:644)

The exception is coming from LinkedHashMap, a built-in Java object type.

http://docs.oracle.com/javase/6/docs/api/java/util/LinkedHashMap.html

The code that made the call that's failing is line 644 of this source
code file:

solr/core/src/java/org/apache/solr/response/XMLWriter.java

I looked at the 3.2 source code.  What's going on here is fairly normal
- it's iterating through a Map and outputting the data contained there
to the writer.

The actual problem is occurring elsewhere, it's only showing up in
XMLWriter due to the way LinkedHashMap objects work.  Another thread has
modified the Map while the iterator is being used. This is something
you're not allowed to do with this object type, so it throws the exception.

I can't find any existing Solr bugs, so the question is: Are you using
any custom code with Solr?  Perhaps something you downloaded or
purchased, or something you wrote in-house?  If so, then that code has
some bugs.

If this *is* a bug in Solr 3.x, it is highly unlikely that it will get
fixed, at least in a 3.x version.  If it still exists in version 4.x
(which is unlikely), then it will get fixed there.  Version 3.2 is two
years old, and the entire 3.x branch is in maintenance mode, meaning
that only EXTREMELY severe bugs will be fixed.


Thanks,
Shawn

Revolution writeup

2013-11-25 Thread Michael Sokolov
I just posted a writeup of the Lucene/Solr Revolution Dublin
conference.  I've been waiting for videos to become available, but I got
impatient.  Slides are mostly there, though.  Sorry if I missed your
talk -- I'm hoping to catch up when the videos are posted...


http://blog.safariflow.com/2013/11/25/this-revolution-will-be-televised/

-Mike Sokolov


Re: Solr 4.x : how to implement an update processor chain working for partial updates

2013-11-25 Thread Chris Hostetter
: 
: Is there a way to have solr execute my ConditionalCopyProcessor on the
: actual updated doc (the one resulting from solr retrieving all stored values
: and merging with update request values), and not on the request doc ?

Partial Updates, and loading the existing stored fields of a document 
that is being partially updated, happens in the DistributedUpdateProcessor 
as part of hte leader logic (so that we can be confident we have the 
correct field values and _version_ info even if there are competing 
updates to the same document)

If you configure your update processor to happen *after* the
DistributedUpdateProcessor, then the document will be fully populated.
The downside, however, is that your processor will be run
redundantly on each replica, which can be annoying if it's a resource-intensive
update processor or requires hitting an external resource.

NOTE: even if you aren't using SolrCloud, you still get an implicit 
instance of DistributedUpdateProcessor precisely so that partial updates 
will work...

https://wiki.apache.org/solr/UpdateRequestProcessor#Distributed_Updates
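
A hedged sketch of what such a chain could look like in solrconfig.xml (the
com.example class name is only a placeholder for the custom
ConditionalCopyProcessor factory):

<updateRequestProcessorChain name="lang-copy">
  <!-- processors before DistributedUpdateProcessorFactory run before the
       update is distributed to the leader/replicas -->
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- partial updates are merged with the stored document here -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <!-- runs on every replica, but now sees the fully populated document -->
  <processor class="com.example.ConditionalCopyProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>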



-Hoss


Re: In a function query, I can't get the ValueSource when extending ValueSourceParser

2013-11-25 Thread Chris Hostetter

I'm not sure I understand your question - largely because you've only
provided a small sample of information about what you are doing, and are not
giving the full picture.

What are you actually trying to accomplish?
With your custom ValueSourceParser, what input are you sending to Solr
that generates that error?
What does your DateFunction do?

Best I can tell from the information provided, you've registered your
DateSourceParser using the name 'dateDeboost' (just a guess, you never
actually said) and then you tried using it in a request in
some way (boost function?) as 'dateDeboost()' (just guessing based on the
error message).

In which case this error is entirely expected, because your parse
implementation says that you expect your function to be passed as input
another value source -- but when you called your function (in the input
string 'dateDeboost()') you didn't specify any arguments at all, let
alone an input argument that could be evaluated as a nested ValueSource.
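
A hedged sketch (not from this reply, just one way to reconcile the two): either
invoke the function with a nested source, e.g. {!boost b=dateDeboost(pubdate)}title:test
("pubdate" is a placeholder field name), or make the argument optional so a bare
dateDeboost() keeps working:

public class DateSourceParser extends ValueSourceParser {
    @Override
    public ValueSource parse(FunctionQParser fp) throws SyntaxError {
        if (fp.hasMoreArguments()) {
            // dateDeboost(somefield) -- wrap whatever value source was passed in
            return new DateFunction(fp.parseValueSource());
        }
        // bare dateDeboost() -- fall back to the original no-argument behaviour
        return new DateFunction();
    }
}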




: Date: Mon, 25 Nov 2013 02:11:43 -0800 (PST)
: From: sling sling...@gmail.com
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: In a functon query,
: I can't get the ValueSource when extend ValueSourceParser
: 
: hi,
: I am working with solr4.1.
: When I don't parseValueSource, my function query works well. The code is
: like this:
: public class DateSourceParser extends ValueSourceParser {
:   @Override
:   public void init(NamedList namedList) {
:   }
:   @Override
:   *public ValueSource parse(FunctionQParser fp) throws SyntaxError {  

:   return new DateFunction();
:   }*
: }
: 
: When I want to use the ValueSource, like this:
: public class DateSourceParser extends ValueSourceParser {
:   @Override
:   public void init(NamedList namedList) {
:   }
:   @Override
:   *public ValueSource parse(FunctionQParser fp) throws SyntaxError {
:   ValueSource source = fp.parseValueSource();
:   return new DateFunction(source);
:   }*
: }
: 
: fp.parseValueSource() throws an error like this:
: ERROR [org.apache.solr.core.SolrCore] -
: org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError:
: Expected identifier at pos 12 str='dateDeboost()'
: at
: 
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:147)
: at
: 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:187)
: at
: 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
: at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
: at
: 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
: at
: 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
: at
: 
com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:70)
: at
: 
com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:173)
: at
: 
com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:229)
: at
: com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:274)
: at com.caucho.server.port.TcpConnection.run(TcpConnection.java:514)
: at com.caucho.util.ThreadPool.runTasks(ThreadPool.java:527)
: at com.caucho.util.ThreadPool.run(ThreadPool.java:449)
: at java.lang.Thread.run(Thread.java:662)
: Caused by: org.apache.solr.search.SyntaxError: Expected identifier at pos 12
: str='dateDeboost()'
: at
: org.apache.solr.search.QueryParsing$StrParser.getId(QueryParsing.java:747)
: at
: org.apache.solr.search.QueryParsing$StrParser.getId(QueryParsing.java:726)
: at
: 
org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:345)
: at
: 
org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:223)
: at
: org.sling.solr.custom.DateSourceParser.parse(DateSourceParser.java:24)
: at
: 
org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:352)
: at
: org.apache.solr.search.FunctionQParser.parse(FunctionQParser.java:68)
: at org.apache.solr.search.QParser.getQuery(QParser.java:142)
: at
: org.apache.solr.search.BoostQParserPlugin$1.parse(BoostQParserPlugin.java:61)
: at org.apache.solr.search.QParser.getQuery(QParser.java:142)
: at
: 
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:117)
: ... 13 more
: 
: 
: so, how to make fp.parseValueSource() work?
: 
: Thanks!!!
: 
: sling
: 
: 
: 
: 
: 
: 

-Hoss


Re: csv does not return custom fields (distance)

2013-11-25 Thread Chris Hostetter

It's a known issue: support for returning pseudo-fields in the CSV
response writer was never implemented.  Need someone to spend some time
working up a patch to add it...

https://issues.apache.org/jira/browse/SOLR-5423


: Date: Wed, 20 Nov 2013 20:55:53 -0800 (PST)
: From: GaneshSe ganeshmail...@gmail.com
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: csv does not return custom fields (distance)
: 
: I am using spacial search feature in Solr (4.0) version. 
: 
: When I try to extract the CSV (using the wt=csv option) with the edismax
: parser, I don't get all the fields specified in the fl parameter in the CSV
: output. Only the schema fields and the score are coming out in the CSV; the
: custom fields, like the distance specified below, do not come out in
: the CSV file. But I am able to get them with the wt=xml option.
: 
: 
q=+(Name:abcd)&sfield=location&rows=100&defType=edismax&pt=40.721587,-73.886938&q.op=OR&isShard=true&start=0&fl=*,score,dist:geodist()&wt=csv
: 
: The above is not the complete query.
: 
: I would like to have distance in the CSV output, any help please?
: 
: 
: 
: 
: 

-Hoss


New to Solr - Need advice on clustering

2013-11-25 Thread Anders Kåre Olsen
Hi Solr-users

I’m trying to set up Solr for search and indexing on the project I’m working on.

My project is an e-commerce B2B solution. We are planning on setting up 2
frontend servers for the website, and I was planning on installing Solr on
these servers. We are using Windows Server 2012 for the frontend servers.

We are not expecting a huge load on the servers, so we expect these 2 servers 
to be adequate to handle both the website and search index.

I have been looking at SolrCloud and ZooKeeper. However, I have read that you
need at least 3 ZooKeepers in an ensemble, and I only have 2 servers.

I need to handle the situation where one of the servers crashes, so I need both 
servers to have a Solr index.

Do you have any advice on the best setup for my situation?

Thank you for your help.

Regards
Anders Olsen

POLL: Solr vs. SolrCloud usage

2013-11-25 Thread Otis Gospodnetic
Hi,

It would be great to see what Solr people are using - Solr or SolrCloud:

Vote == http://blog.sematext.com/2013/11/25/poll-solr-cloud-usage/

Here are a couple of old polls, if you are curious about this sort of stuff
like I am:

* http://blog.sematext.com/2013/02/25/poll-solr-cloud-or-not/ -- from 9
months ago

* http://blog.sematext.com/2013/02/15/poll-which-solr-version-are-you-using/

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


Re: building custom cache - using lucene docids

2013-11-25 Thread Roman Chyla
On Sun, Nov 24, 2013 at 8:31 AM, Erick Erickson erickerick...@gmail.comwrote:

 bq: Do i understand you correctly that when two segmets get merged, the
 docids
 (of the original segments) remain the same?

 The original segments are unchanged, segments are _never_ changed after
 they're closed. But they'll be thrown away. Say you have segment1 and
 segment2 that get merged into segment3. As soon as the last searcher
 that is looking at segment1 and segment2 is closed, those two segments
 will be deleted from your disk.

 But for any given doc, the docid in segment3 will very likely be different
 than it was in segment1 or 2.


I'm trying to figure this out - I'll have to dig, I suppose. For example,
if the docbase (the docid offset per searcher) were stored together with the
index segment, that would be an indication of the 'relative stability of docids'.



 I think you're reading too much into LUCENE-2897. I'm pretty sure the
 segment in question is not available to you anyway before this rewrite is
 done,
 but freely admit I don't know much about it.


I've done tests, committing and overwriting a document, and saw (in Solr 4.0)
that docids are being recycled. I deleted 2 docs, then added a new document,
and guess what: the new document had the docid of the previously deleted
document (but different fields).

That was new to me, so I searched and found LUCENE-2897, which seemed to
explain that behaviour.



 You're probably going to get into the whole PerSegment family of
 operations,
 which is something I'm not all that familiar with so I'll leave
 explanations
 to others.


Thank you, it is useful to get insights from various sides,

  roman



 On Sat, Nov 23, 2013 at 8:22 PM, Roman Chyla roman.ch...@gmail.com
 wrote:

  Hi Erick,
  Many thanks for the info. An additional question:
 
  Do i understand you correctly that when two segmets get merged, the
 docids
  (of the original segments) remain the same?
 
  (unless, perhaps in situation, they were merged using the last index
  segment which was opened for writing and where the docids could have
  suddenly changed in a commit just before the merge)
 
  Yes, you guessed right that I am putting my code into the custom cache -
 so
  it gets notified on index changes. I don't know yet how, but I think I
 can
  find the way to the current active, opened (last) index segment. Which is
  actively updated (as opposed to just being merged) -- so my definition of
  'not last ones' is: where docids don't change. I'd be grateful if someone
  could spot any problem with such assumption.
 
  roman
 
 
 
 
  On Sat, Nov 23, 2013 at 7:39 PM, Erick Erickson erickerick...@gmail.com
  wrote:
 
   bq: But can I assume
   that docids in other segments (other than the last one) will be
  relatively
   stable?
  
   Kinda. Maybe. Maybe not. It depends on how you define other than the
   last one.
  
   The key is that the internal doc IDs may change when segments are
   merged. And old segments get merged. Doc IDs will _never_ change
   in a segment once it's closed (although as you note they may be
   marked as deleted). But that segment may be written to a new segment
   when merging and the internal ID for a given document in the new
   segment bears no relationship to internal ID in the old segment.
  
   BTW, I think you only really care when opening a new searchers. There
 is
   a UserCache (see solrconfig.xml) that gets notified when a new searcher
   is being opened to give it an opportunity to refresh itself, is that
   useful?
  
   As long as a searcher is open, it's guaranteed that nothing is
 changing.
   Hard commits with openSearcher=false don't open new searchers, which
   is why changes aren't visible until a softCommit or a hard commit with
   openSearcher=true despite the fact that the segments are closed.
  
   FWIW,
   Erick
  
   Best
   Erick
  
  
  
   On Sat, Nov 23, 2013 at 12:40 AM, Roman Chyla roman.ch...@gmail.com
   wrote:
  
Hi,
docids are 'ephemeral', but i'd still like to build a search cache
 with
them (they allow for the fastest joins).
   
i'm seeing docids keep changing with updates (especially, in the last
   index
segment) - as per
https://issues.apache.org/jira/browse/LUCENE-2897
   
That would be fine, because i could build the cache from diff (of
 index
state) + reading the latest index segment in its entirety. But can I
   assume
that docids in other segments (other than the last one) will be
   relatively
stable? (ie. when an old doc is deleted, the docid is marked as
  removed;
update doc = delete old  create a new docid)?
   
thanks
   
roman
   
  
 



Re: building custom cache - using lucene docids

2013-11-25 Thread Roman Chyla
On Sun, Nov 24, 2013 at 10:44 AM, Jack Krupansky j...@basetechnology.comwrote:

 We should probably talk about internal Lucene document IDs and
 external or rebased Lucene document IDs. The internal document IDs are
 always per-segment and never, ever change for that closed segment. But...
 the application would not normally see these IDs. Usually the externally
 visible Lucene document IDs have been rebased to add the sum total count
 of documents (both existing and deleted) of all preceding segments to the
 document IDs of a given segment, producing a global (across the full
 index of all segments) Lucene document ID.

 So, if you have those three segments, with deleted documents in the first
 two segments, and then merge those first two segments, the
 externally-visible Lucene document IDs for the third segment will suddenly
 all be different, shifted lower by the number of deleted documents that
 were just merged away, even though nothing changed in the third segment
 itself.
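
For concreteness, a small sketch (assuming the Lucene 4.x API) of where that
per-segment offset lives: each leaf of a composite reader exposes a docBase,
and the "global" docid is just docBase plus the segment-local docid.

import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.IndexReader;

public class DocBaseDump {
    // print each segment's offset; globalDocId = leaf.docBase + localDocId
    public static void dump(IndexReader reader) {
        for (AtomicReaderContext leaf : reader.leaves()) {
            System.out.println("segment ord=" + leaf.ord
                    + " docBase=" + leaf.docBase
                    + " maxDoc=" + leaf.reader().maxDoc());
        }
    }
}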


That's right, and I'm starting to think that if I keep the segment id and
the original offset, I don't need to rebuild that part of the cache,
because it has not been rebased (but I can always update the deleted docs).
It seems simple, so I suspect there is a catch somewhere, but if it
works, that could potentially speed up any cache building.

Do you know where the docBase of each segment is stored? Or
which Java class I should start my exploration from? [It is a somewhat
sprawling complex, so I'm a bit lost :)]



 Maybe these should be called local (to the segment) Lucene document IDs
 and global (across all segment) Lucene document IDs. Or, maybe internal
 vs. external is good enough.

 In short, it is completely safe to use and save Lucene document IDs, but
 only as long as no merging of segments is performed. Even one tiny merge
 and all subsequent saved document IDs are invalidated. Be careful with your
 merge policy - normally merges are happening in the background,
 automatically.


My tests, as per the previous email, showed that the last segment's docids are
not that stable. I don't know if it matters that I used the RAMDirectory
for the test, but the docids were being 'recycled': the deleted docs were
in the previous segment, then suddenly their docids were inside newly
added documents (so maybe Solr/Lucene is not counting deleted docs if they
are at the end of a segment...?). I don't know; I'll need to explore the
index segments to understand what was going on there. Thanks for any
possible pointers.


  roman





 -- Jack Krupansky

 -Original Message- From: Erick Erickson
 Sent: Sunday, November 24, 2013 8:31 AM
 To: solr-user@lucene.apache.org
 Subject: Re: building custom cache - using lucene docids


 bq: Do i understand you correctly that when two segmets get merged, the
 docids
 (of the original segments) remain the same?

 The original segments are unchanged, segments are _never_ changed after
 they're closed. But they'll be thrown away. Say you have segment1 and
 segment2 that get merged into segment3. As soon as the last searcher
 that is looking at segment1 and segment2 is closed, those two segments
 will be deleted from your disk.

 But for any given doc, the docid in segment3 will very likely be different
 than it was in segment1 or 2.

 I think you're reading too much into LUCENE-2897. I'm pretty sure the
 segment in question is not available to you anyway before this rewrite is
 done,
 but freely admit I don't know much about it.

 You're probably going to get into the whole PerSegment family of
 operations,
 which is something I'm not all that familiar with so I'll leave
 explanations
 to others.


 On Sat, Nov 23, 2013 at 8:22 PM, Roman Chyla roman.ch...@gmail.com
 wrote:

  Hi Erick,
 Many thanks for the info. An additional question:

 Do i understand you correctly that when two segmets get merged, the docids
 (of the original segments) remain the same?

 (unless, perhaps in situation, they were merged using the last index
 segment which was opened for writing and where the docids could have
 suddenly changed in a commit just before the merge)

 Yes, you guessed right that I am putting my code into the custom cache -
 so
 it gets notified on index changes. I don't know yet how, but I think I can
 find the way to the current active, opened (last) index segment. Which is
 actively updated (as opposed to just being merged) -- so my definition of
 'not last ones' is: where docids don't change. I'd be grateful if someone
 could spot any problem with such assumption.

 roman




 On Sat, Nov 23, 2013 at 7:39 PM, Erick Erickson erickerick...@gmail.com
 wrote:

  bq: But can I assume
  that docids in other segments (other than the last one) will be
 relatively
  stable?
 
  Kinda. Maybe. Maybe not. It depends on how you define other than the
  last one.
 
  The key is that the internal doc IDs may change when segments are
  merged. And old segments get merged. Doc IDs will _never_ change
  

Re: building custom cache - using lucene docids

2013-11-25 Thread Roman Chyla
On Mon, Nov 25, 2013 at 12:54 AM, Mikhail Khludnev 
mkhlud...@griddynamics.com wrote:

 Roman,

 I don't fully understand your question. After a segment is flushed it's never
 changed, hence segment-local docids are always the same. Due to a merge a
 segment can go away, and its docs become new ones in another segment.  This is
 true for 'global' (Solr-style) docnums, which can flip after a merge
 happens in the middle of the segments' chain.
 As well you are saying about segmented cache I can propose you to look at
 CachingWrapperFilter and NoOpRegenerator as a pattern for such data
 structures.


Thanks Mikhail, the CWF confirms that the idea of regenerating just part of
the cache is doable. The CacheRegenerators, on the other hand, make no
sense to me - and they are not given any 'signals', so they don't know if
they are in the middle of some regeneration or not, and they should not
keep a state (of previous index) - as they can be shared by threads that
build the cache

Best,

  roman




 On Sat, Nov 23, 2013 at 9:40 AM, Roman Chyla roman.ch...@gmail.com
 wrote:

  Hi,
  docids are 'ephemeral', but i'd still like to build a search cache with
  them (they allow for the fastest joins).
 
  i'm seeing docids keep changing with updates (especially, in the last
 index
  segment) - as per
  https://issues.apache.org/jira/browse/LUCENE-2897
 
  That would be fine, because i could build the cache from diff (of index
  state) + reading the latest index segment in its entirety. But can I
 assume
  that docids in other segments (other than the last one) will be
 relatively
  stable? (ie. when an old doc is deleted, the docid is marked as removed;
  update doc = delete old  create a new docid)?
 
  thanks
 
  roman
 



 --
 Sincerely yours
 Mikhail Khludnev
 Principal Engineer,
 Grid Dynamics

 http://www.griddynamics.com
  mkhlud...@griddynamics.com



Re: New to Solr - Need advice on clustering

2013-11-25 Thread Gora Mohanty
On 26 November 2013 01:44, Anders Kåre Olsen a...@mail.dk wrote:
 Hi Solr-users

 I’m trying to setup Solr for search and indexing on the project I’m working 
 on.

 My project is a e-commerce B2B solution. We are planning on setting up 2 
 frontend servers for the website, and I was planning on installing Solr on 
 these servers. We are using Windows Server 2012 for the frontend servers.

 We are not expecting a huge load on the servers, so we expect these 2 servers 
 to be adequate to handle both the website and search index.

 I have been looking at SolrCloud and ZooKeeper. Howver I have read that you 
 need at least 3 ZooKeepers in an ensamble, and I only have 2 servers.

 I need to handle the situation where one of the servers crashes, so I need 
 both servers to have a Solr index.
[...]

If you do not want to get into SolrCloud, a simpler
solution might be an HTTP load balancer in front of
the two Solr instances. Hardware load balancers are
better, but more expensive. A software load balancer
like haproxy should meet your needs.

Regards,
Gora


Re: Solr 4.x : how to implement an update processor chain working for partial updates

2013-11-25 Thread Alexandre Rafalovitch
SOLR-5395 just out with 4.6 might have some relevance here (RunAlways
marker interface for UpdateRequestProcessorFactory). Not sure how it
affects partial updates though.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Tue, Nov 26, 2013 at 1:44 AM, Chris Hostetter
hossman_luc...@fucit.orgwrote:

 :
 : Is there a way to have solr execute my ConditionalCopyProcessor on the
 : actual updated doc (the one resulting from solr retrieving all stored
 values
 : and merging with update request values), and not on the request doc ?

 Partial Updates, and loading the existing stored fields of a document
 that is being partially updated, happens in the DistributedUpdateProcessor
 as part of hte leader logic (so that we can be confident we have the
 correct field values and _version_ info even if there are competing
 updates to the same document)

 if you configure your update processor to happen *after* the
 DistributedUpdateProcessor, then the document will be fuly populated --
 unfortunatly.  the down side however is that your processorwill be run
 redundently on each replica, which can be anoying if it's a resource
 intensive update processor or requires hitting an external resource.

 NOTE: even if you aren't using SolrCloud, you still get an implicit
 instance of DistributedUpdateProcessor precisely so that partial updates
 will work...

 https://wiki.apache.org/solr/UpdateRequestProcessor#Distributed_Updates



 -Hoss



Re: In a function query, I can't get the ValueSource when extending ValueSourceParser

2013-11-25 Thread sling
Thanks a lot for your reply, Chris.

I was trying to sort the query result by the DateFunction, by passing
q={!boost b=dateDeboost()}title:test to the /select request handler.

Before, my custom DateFunction was like this:

public class DateFunction extends FieldCacheSource {
    private static final long serialVersionUID = 6752223682280098130L;
    private static long now;

    public DateFunction(String field) {
        super(field);
        now = System.currentTimeMillis();
    }

    @Override
    public FunctionValues getValues(Map context,
            AtomicReaderContext readerContext) throws IOException {
        long[] times = cache.getLongs(readerContext.reader(), field, false);
        final float[] weights = new float[times.length];
        for (int i = 0; i < times.length; i++) {
            weights[i] = ScoreUtils.getNewsScoreFactor(now, times[i]);
        }
        return new FunctionValues() {
            @Override
            public float floatVal(int doc) {
                return weights[doc];
            }
        };
    }
}
It calculates every document's date-weight, but at any given time only one
doc's date-weight is actually needed, so it runs slowly.

When I looked at the source code of the recip function in
org.apache.solr.search.ValueSourceParser, it is like this:
addParser("recip", new ValueSourceParser() {
  @Override
  public ValueSource parse(FunctionQParser fp) throws SyntaxError {
    ValueSource source = fp.parseValueSource();
    float m = fp.parseFloat();
    float a = fp.parseFloat();
    float b = fp.parseFloat();
    return new ReciprocalFloatFunction(source, m, a, b);
  }
});
and in the ReciprocalFloatFunction, it gets the values like this:
@Override
public FunctionValues getValues(Map context, AtomicReaderContext readerContext) throws IOException {
  final FunctionValues vals = source.getValues(context, readerContext);
  return new FloatDocValues(this) {
    @Override
    public float floatVal(int doc) {
      return a / (m * vals.floatVal(doc) + b);
    }
    @Override
    public String toString(int doc) {
      return Float.toString(a) + "/("
          + m + "*float(" + vals.toString(doc) + ')'
          + '+' + b + ')';
    }
  };
}

So I think this is what I want.
When calculating a doc's date-weight, I needn't call cache.getLongs(x);
instead, I should use source.getValues(xxx).

Therefore I changed my code, but when fp.parseValueSource() is called, it throws an
error like this:
org.apache.solr.search.SyntaxError: Expected identifier at pos 12
str='dateDeboost()' 
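
For reference, fp.parseValueSource() consumes an argument of the function, so
the parser and the invocation have to agree; a minimal sketch, with the date
field name ptime used only as a placeholder and a DateFunction variant that
wraps a ValueSource:

import org.apache.lucene.queries.function.ValueSource;
import org.apache.solr.search.FunctionQParser;
import org.apache.solr.search.SyntaxError;
import org.apache.solr.search.ValueSourceParser;

public class DateSourceParser extends ValueSourceParser {
    @Override
    public ValueSource parse(FunctionQParser fp) throws SyntaxError {
        // invoked as dateDeboost(ptime): the argument becomes the ValueSource;
        // calling it as dateDeboost() leaves nothing to parse right after the
        // '(' (position 12), which is what produces "Expected identifier".
        ValueSource source = fp.parseValueSource();
        return new DateFunction(source);
    }
}

// query side: q={!boost b=dateDeboost(ptime)}title:test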

Did I describe it clearly this time?

Thanks again!

sling






Please help me to understand debugQuery output

2013-11-25 Thread Amit Aggarwal

Hello All,

Can anyone help me in understanding debugQuery output like this?


<lst name="explain">
  <str>
0.6276088 = (MATCH) sum of:
  0.6276088 = (MATCH) max of:
    0.18323982 = (MATCH) sum of:
      0.18323982 = (MATCH) weight(state_search:a in 327) [DefaultSimilarity], result of:
        0.18323982 = score(doc=327,freq=2.0 = termFreq=2.0), product of:
          0.3188151 = queryWeight, product of:
            3.2512918 = idf(docFreq=35, maxDocs=342)
            0.098057985 = queryNorm
          0.5747526 = fieldWeight in 327, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.2512918 = idf(docFreq=35, maxDocs=342)
            0.125 = fieldNorm(doc=327)
    0.2505932 = (MATCH) sum of:
      0.2505932 = (MATCH) weight(country_search:a in 327) [DefaultSimilarity], result of:
        0.2505932 = score(doc=327,freq=1.0 = termFreq=1.0), product of:
          0.3135134 = queryWeight, product of:
            3.1972246 = idf(docFreq=37, maxDocs=342)
            0.098057985 = queryNorm
          0.79930615 = fieldWeight in 327, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            3.1972246 = idf(docFreq=37, maxDocs=342)
            0.25 = fieldNorm(doc=327)
    0.25283098 = (MATCH) sum of:
      0.25283098 = (MATCH) weight(area_search:a in 327) [DefaultSimilarity], result of:
        0.25283098 = score(doc=327,freq=1.0 = termFreq=1.0), product of:
          0.398 = queryWeight, product of:
            4.06 = idf(docFreq=15, maxDocs=342)
            0.098057985 = queryNorm
          0.6347222 = fieldWeight in 327, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            4.06 = idf(docFreq=15, maxDocs=342)
            0.15625 = fieldNorm(doc=327)
    0.6276088 = (MATCH) sum of:
      0.12957011 = (MATCH) weight(city_search:a in 327) [DefaultSimilarity], result of:
        0.12957011 = score(doc=327,freq=1.0 = termFreq=1.0), product of:
          0.3188151 = queryWeight, product of:
            3.2512918 = idf(docFreq=35, maxDocs=342)
            0.098057985 = queryNorm
          0.40641147 = fieldWeight in 327, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            3.2512918 = idf(docFreq=35, maxDocs=342)
            0.125 = fieldNorm(doc=327)
      0.3638727 = (MATCH) weight(city_search:ab in 327) [DefaultSimilarity], result of:
        0.3638727 = score(doc=327,freq=1.0 = termFreq=1.0), product of:
          0.5342705 = queryWeight, product of:
            5.4485164 = idf(docFreq=3, maxDocs=342)
            0.098057985 = queryNorm
          0.68106455 = fieldWeight in 327, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            5.4485164 = idf(docFreq=3, maxDocs=342)
            0.125 = fieldNorm(doc=327)
      0.13416591 = (MATCH) weight(city_search:b in 327) [DefaultSimilarity], result of:
        0.13416591 = score(doc=327,freq=1.0 = termFreq=1.0), product of:
          0.32441998 = queryWeight, product of:
            3.3084502 = idf(docFreq=33, maxDocs=342)
            0.098057985 = queryNorm
          0.41355628 = fieldWeight in 327, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            3.3084502 = idf(docFreq=33, maxDocs=342)
            0.125 = fieldNorm(doc=327)
  </str>
</lst>

Any links where this output is explained?

Thanks

--
Amit Aggarwal
8095552012



Re: a function query of time, frequency and score.

2013-11-25 Thread sling
Thanks, Erick.
What I want to do is customize the sort by date/time and a frequency number.
I want to know if there is some formula to tackle this.

Thanks again!
sling
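
One common shape for such a formula, using the ptime and frequency field names
from the original mail quoted below (the constants and the title:test query are
only illustrative), is to multiply a date decay by a slowly growing frequency
term, e.g.
q={!boost b=product(recip(ms(NOW,ptime),3.16e-11,1,1),sum(1,log(sum(frequency,1))))}title:test
Here recip() decays with document age, and sum(1,log(sum(frequency,1))) stays at
1 when frequency is 0 and grows slowly as frequency rises.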


On Fri, Nov 22, 2013 at 9:11 PM, Erick Erickson [via Lucene] 
ml-node+s472066n4102599...@n3.nabble.com wrote:

 Not quite sure what you're asking. The field() function query brings the
 value of a field into the score, something like:
 http://localhost:8983/solr/select?wt=jsonfl=id%20scoreq={!boost%20b=field(popularity)}ipod


 Best,
 Erick


 On Thu, Nov 21, 2013 at 10:43 PM, sling [hidden email]
 wrote:

  Hi, guys.
 
  I indexed 1000 documents, which have fields like title, ptime and
  frequency.
 
  The title is a text field, the ptime is a date field, and the frequency
 is an int field.
  The frequency field goes up and down: sometimes its value is 0, and
  sometimes its value is 999.
 
  Now, in my app, the query works well with a function query. The function
  query is implemented as the score multiplied by a decreasing date-weight
  array.
 
  However, I have no idea how to add the frequency to this formula...
 
  so could someone give me a clue?
 
  Thanks again!
 
  sling
 
 
 
 








Re: building custom cache - using lucene docids

2013-11-25 Thread Roman Chyla
OK, I've spent some time reading the solr/lucene4x classes, and this is
my understanding (feel free to correct me ;-))

DirectoryReader holds the opened segments -- each segment has its own
reader; the BaseCompositeReader (or extended classes thereof) stores the
offsets per segment, e.g. [0, 5, 22] - meaning there are 2 segments,
with 5 and 17 docs respectively

The segments are listed in the segments_N file,
http://lucene.apache.org/core/3_0_3/fileformats.html#Segments
File

So theoretically, the order of segments could change when a merge happens - yet
every SegmentReader is identified by a unique name, and this name doesn't
change unless the segment itself changed (i.e. docs were deleted, or it got
more docs) - so it is possible to rely on this name to know what has not
changed.

the name is coming from SegmentInfo (check its toString method) -- the
SegmentInfo has a method equals() that will consider as equal the readers
with the same name and the same dir (which is useful to know - two readers,
one with deletes, one without, are equal)

Lucene's FieldCache itself is rather complex, but it shows there is a very
clever mechanism (a few, actually!) -- a class can register a listener that
will be called whenever an index segment is being closed (this could be
used to invalidate portions of a cache); the relevant classes are
SegmentReader.CoreClosedListener and IndexReader.ReaderClosedListener.

But Lucene is using this mechanism only to purge the cache - so
effectively, every commit triggers a cache rebuild. This is the interesting
bit: lots of work could be spared if segment data were reused (but
admittedly, only sometimes - for data that was fully read into memory; for
anything else, such as terms, the cache reads only some values and
fetches the rest from the index - so Lucene must close the reader and
rebuild the cache on every commit; but that is not my case, as I am going to
copy values from an index and store them in memory...)

the weird 'recycling' of docids I've observed can probably be explained
by the fact that the index reader contains both segment readers and
near-realtime readers (but I'm not sure about this)

To conclude: it is possible to build a cache that updates itself (with only
changes committed since the last build) - this will have impact on how fast
new searcher is ready to serve requests
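
A minimal sketch of that idea in plain Lucene 4.x (the value loader is only a
placeholder): key per-segment data by the segment's core cache key and reload
just the segments not seen before, so unchanged segments survive a commit:

import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.IndexReader;

public class SegmentLocalCache {
  private final Map<Object, long[]> perSegment = new HashMap<Object, long[]>();

  public void refresh(IndexReader reader) throws IOException {
    Set<Object> live = new HashSet<Object>();
    for (AtomicReaderContext leaf : reader.leaves()) {
      AtomicReader segment = leaf.reader();
      Object key = segment.getCoreCacheKey();
      live.add(key);
      if (!perSegment.containsKey(key)) {
        perSegment.put(key, loadValues(segment)); // only unseen segments are read
      }
    }
    perSegment.keySet().retainAll(live); // drop segments that were merged away
  }

  private long[] loadValues(AtomicReader segment) throws IOException {
    return new long[segment.maxDoc()]; // placeholder for the real per-doc values
  }
}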

HTH somebody else too :)

  roman



On Mon, Nov 25, 2013 at 7:54 PM, Roman Chyla roman.ch...@gmail.com wrote:




 On Mon, Nov 25, 2013 at 12:54 AM, Mikhail Khludnev 
 mkhlud...@griddynamics.com wrote:

 Roman,

 I don't fully understand your question. After a segment is flushed it's never
 changed, hence segment-local docids are always the same. Due to a merge a
 segment can go away, and its docs become new ones in another segment. This is
 true for 'global' (Solr-style) docnums, which can flip after a merge happens
 in the middle of the segment chain.
 As for what you are saying about a segmented cache, I can suggest looking at
 CachingWrapperFilter and NoOpRegenerator as a pattern for such data
 structures.


 Thanks Mikhail, the CWF confirms that the idea of regenerating just part
 of the cache is doable. The CacheRegenerators, on the other hand, make no
 sense to me - and they are not given any 'signals', so they don't know if
 they are in the middle of some regeneration or not, and they should not
 keep a state (of previous index) - as they can be shared by threads that
 build the cache

 Best,

   roman




 On Sat, Nov 23, 2013 at 9:40 AM, Roman Chyla roman.ch...@gmail.com
 wrote:

  Hi,
  docids are 'ephemeral', but i'd still like to build a search cache with
  them (they allow for the fastest joins).
 
  i'm seeing docids keep changing with updates (especially, in the last
 index
  segment) - as per
  https://issues.apache.org/jira/browse/LUCENE-2897
 
  That would be fine, because i could build the cache from diff (of index
  state) + reading the latest index segment in its entirety. But can I
 assume
  that docids in other segments (other than the last one) will be
 relatively
  stable? (ie. when an old doc is deleted, the docid is marked as removed;
  update doc = delete old & create a new docid)?
 
  thanks
 
  roman
 



 --
 Sincerely yours
 Mikhail Khludnev
 Principal Engineer,
 Grid Dynamics

 http://www.griddynamics.com
  mkhlud...@griddynamics.com





Re: New to Solr - Need advice on clustering

2013-11-25 Thread Anders Kåre Olsen


Hi Gora

Thank you for your reply.

We are planning on having a loadbalancer in front of our frontend servers.

If I have two distinct solr indexes, how will I keep them synchronized? I 
expect that one of the frontend servers will have the task of updating the 
 product repository on the e-commerce site. This server will then update the 
 local Solr index after the product update has finished.


Is there an easy way that I can keep the two indexes synchronized without 
SolrCloud?


Regards
Anders

-----Original Message----- 
From: Gora Mohanty

Sent: Tuesday, November 26, 2013 2:37 AM
To: solr-user@lucene.apache.org
Subject: Re: New to Solr - Need advice on clustering

On 26 November 2013 01:44, Anders Kåre Olsen a...@mail.dk wrote:

Hi Solr-users

 I’m trying to set up Solr for search and indexing on the project I’m 
working on.


 My project is an e-commerce B2B solution. We are planning on setting up 2 
frontend servers for the website, and I was planning on installing Solr on 
these servers. We are using Windows Server 2012 for the frontend servers.


We are not expecting a huge load on the servers, so we expect these 2 
servers to be adequate to handle both the website and search index.


 I have been looking at SolrCloud and ZooKeeper. However, I have read that 
 you need at least 3 ZooKeepers in an ensemble, and I only have 2 servers.


I need to handle the situation where one of the servers crashes, so I need 
both servers to have a Solr index.

[...]

If you do not want to get into SolrCloud, a simpler
solution might be a HTTP load balancer in front of
the two Solr instances. Hardware load balancers are
better, but more expensive. A software load balancer
like haproxy should meet your needs.

Regards,
Gora 



HttpSolrServer - Http Client Connection pooling issue

2013-11-25 Thread imgauravd
Hi,

Hopefully I am mailing the correct mail id for Solr issues. If not, then please 
let me know.

We are using Solr 4.3.1 and we are using HttpSolrServer for querying Solr.

We are trying to do a load and stress test using JMeter and we can see that 
after a certain number of requests Solr responds in a very unusual way. It gets 
stuck and responds only after some time. Upon checking the HTTP connections we 
realized that there are many open connections that are not closed.
My questions are:
1. Is there a way to do HTTP connection pooling?

Note that the HttpSolrServer instance is static.

2. Can I configure HTTP connections using the solrconfig file?

Any pointers would be very helpful.
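
For what it's worth, a rough SolrJ 4.3 sketch of bounding the pool on a shared
client instance (the URL and the numbers are only illustrative):

import org.apache.solr.client.solrj.impl.HttpSolrServer;

public final class SolrClientHolder {
    // one shared, pooled client for the whole JVM
    private static final HttpSolrServer SERVER = createServer();

    private static HttpSolrServer createServer() {
        HttpSolrServer s = new HttpSolrServer("http://localhost:8983/solr/collection1");
        s.setMaxTotalConnections(128);         // cap across all routes
        s.setDefaultMaxConnectionsPerHost(32); // cap per Solr host
        s.setConnectionTimeout(5000);          // ms to establish a connection
        s.setSoTimeout(10000);                 // ms to wait for a response
        return s;
    }

    public static HttpSolrServer get() {
        return SERVER;
    }
}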

--
Thanks,
Gaurav

Re: Setting solr.data.dir for SolrCloud instance

2013-11-25 Thread adfel70
Thanks for the reply, Erick.
Actually, I didn't think this through. I just thought it would be a good
idea to separate the data from the application code.
I guess I'll leave it without setting the datadir parameter and add a
symlink.





Storing solr results in excel

2013-11-25 Thread kumar
Hi,

I am getting two field values from Excel and querying Solr to give the top 1
results. But I need to store the results in another Excel sheet. Can anyone help
me with how to store Solr results in an Excel file using SolrJ?

Regards,
Kumar.





Re: Storing solr results in excel

2013-11-25 Thread Mikhail Khludnev
wt=csv ?
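
wt=csv applies when the request goes over plain HTTP; with SolrJ the response
comes back as documents, so one rough sketch (query, field names, and file name
are only placeholders) is to write the rows out yourself:

import java.io.PrintWriter;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrDocument;

public class ExportToCsv {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
    SolrQuery q = new SolrQuery("field1:foo AND field2:bar");
    q.setFields("id", "score");
    q.setRows(1);
    PrintWriter out = new PrintWriter("results.csv", "UTF-8");
    out.println("id,score"); // header row so Excel labels the columns
    for (SolrDocument doc : solr.query(q).getResults()) {
      out.println(doc.getFieldValue("id") + "," + doc.getFieldValue("score"));
    }
    out.close();
  }
}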


On Tue, Nov 26, 2013 at 11:09 AM, kumar pavan2...@gmail.com wrote:

 Hi,

 I am getting two field values from Excel and querying Solr to give the top 1
 results. But I need to store the results in another Excel sheet. Can anyone
 help
 me with how to store Solr results in an Excel file using SolrJ?

 Regards,
 Kumar.







-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: Please help me to understand debugQuery output

2013-11-25 Thread GaneshSe
You might want to look at the Solr Relevancy FAQ for this:

http://wiki.apache.org/solr/SolrRelevancyFAQ

Also, it will be even better if you look at the link above with the outcome you
want to get to in mind, like "Want to know why this document is better than the
other" or "Why did the document in my db not come up?"
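
As a worked example, reading the numbers straight from the output in the
original mail: for the state_search:a clause, queryWeight = idf * queryNorm =
3.2512918 * 0.098057985 ≈ 0.3188151, fieldWeight = tf * idf * fieldNorm =
1.4142135 * 3.2512918 * 0.125 ≈ 0.5747526, and the clause score is their
product, ≈ 0.18323982. The per-field sums are then combined with "max of"
because dismax keeps only the best-scoring field (0.6276088, from the
city_search clauses here), and that becomes the document score.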


Amit Aggarwal wrote
 Hello All,
 
 Can any one help me in understanding debugQuery output like this.
 
 [...]
 
 Any links where this output is explained?
 
 Thanks
 
 -- 
 Amit Aggarwal
 8095552012





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Please-help-me-to-understand-debugQuery-output-tp4103210p4103241.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: New to Solr - Need advice on clustering

2013-11-25 Thread Sameer Maggon
Anders,

Take a look at Solr Replication. Essentially, you'll treat one as a master
& one as a slave. Both master & slave can be used to serve traffic. If one
of them goes down, the other can be used as a master for the interim.

http://wiki.apache.org/solr/SolrReplication
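
A rough sketch of the two sides in solrconfig.xml (the master host, core name
and poll interval are only placeholders):

<!-- on the master -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- on the slave -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/products</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>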

Sameer.
--
http://measuredsearch.com


On Mon, Nov 25, 2013 at 9:50 PM, Anders Kåre Olsen a...@mail.dk wrote:


 Hi Gora

 Thank you for your reply.

 We are planning on having a loadbalancer in front of our frontend servers.

 If I have two distinct solr indexes, how will I keep them synchronized? I
 expect that one of the frontend servers will have the task of updating the
 product repository on the e-commerce site. This server will then update the
 local solr index after product update has finished.

 Is there an easy way that I can keep the two indexes synchronized without
 SolrCloud?

 Regards
 Anders

-----Original Message----- From: Gora Mohanty
 Sent: Tuesday, November 26, 2013 2:37 AM
 To: solr-user@lucene.apache.org
 Subject: Re: New to Solr - Need advice on clustering


 On 26 November 2013 01:44, Anders Kåre Olsen a...@mail.dk wrote:

 Hi Solr-users

 I’m trying to set up Solr for search and indexing on the project I’m
 working on.

 My project is an e-commerce B2B solution. We are planning on setting up 2
 frontend servers for the website, and I was planning on installing Solr on
 these servers. We are using Windows Server 2012 for the frontend servers.

 We are not expecting a huge load on the servers, so we expect these 2
 servers to be adequate to handle both the website and search index.

 I have been looking at SolrCloud and ZooKeeper. However, I have read that
 you need at least 3 ZooKeepers in an ensemble, and I only have 2 servers.

 I need to handle the situation where one of the servers crashes, so I
 need both servers to have a Solr index.

 [...]

 If you do not want to get into SolrCloud, a simpler
 solution might be a HTTP load balancer in front of
 the two Solr instances. Hardware load balancers are
 better, but more expensive. A software load balancer
 like haproxy should meet your needs.

 Regards,
 Gora