Basically, I think about using SolrCloud whenever you have to split
your corpus into more than one core (shard in SolrCloud terms). Or
when you require fault tolerance in terms of machines going up and
down.
Despite the name, it does _not_ require AWS or similar, and you can
run SolrCloud on a
Hi Erick
they are on the same JVM. I had already tried the core join strategy but
that doesn't solve the faceting problem... i.e. if I have 2 cores, core0 and
core1, and I run this query on core0
/select?q=QUERY&fq={!join from=id1 to=id2 fromIndex=core1}&facet=true&facet.field=tag
it has 2 problems
1)
Specify the join query parser for the main query. See:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser
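For example, something like this (a sketch only; it assumes, per your earlier
mail, that text lives in core0 and tag in core1, and I'm guessing at the
from/to field names):

/solr/core1/select?q={!join from=id to=id fromIndex=core0}text:QUERY&facet=true&facet.field=tag

With the join as the main query, the result set lives in the core that has
the tag field, so the facets are computed over the joined documents.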
-- Jack Krupansky
On Wed, Jun 3, 2015 at 3:32 PM, Robust Links pey...@robustlinks.com wrote:
Hi Erick
they are on the same JVM. I had already
BTW, does anybody know how SolrCloud got that name? I mean, SolrCluster
would make a lot more sense since a cloud is typically a very large
collection of machines and more of a place than a specific configuration,
while a Solr deployment is more typically a modest number of machines,
a
Yes adding _solr worked, thx. But I also had to populate the SOLR_HOST param
for each of the 4 hosts, as in
SOLR_HOST=ec2-52-4-232-216.compute-1.amazonaws.com. I'm in an EC2 VPN
environment which might be the problem.
This command now works (leaving off port)
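i.e. in solr.in.sh, roughly (the ZK_HOST value here is a placeholder for
your own ensemble):

SOLR_HOST=ec2-52-4-232-216.compute-1.amazonaws.com
ZK_HOST=zk1:2181,zk2:2181,zk3:2181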
On 6/3/2015 2:19 PM, Jack Krupansky wrote:
BTW, does anybody know how SolrCloud got that name? I mean, SolrCluster
would make a lot more sense since a cloud is typically a very large
collection of machines and more of a place than a specific configuration,
while a Solr deployment is more
that doesn't work either, and even if it did, joining is not going to be a
solution since I can't query 1 core and facet on the result of the other. To
sum up, my problem is:
core0
  field: id
  field: text
core1
  field: id
  field: tag
I want to
1) query the text field of core0,
2) use the
I took a quick look at the code and it _looks_ like any string
starting with t, T or 1 is evaluated as true and everything else
as false.
sortMissingLast determines sort order if you're sorting on this field
and the document doesn't have a value. Should they be sorted after or
before docs that
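A minimal illustration (the inStock field name is made up):

<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
<field name="inStock" type="boolean" indexed="true" stored="true"/>

With sortMissingLast="true", a query like /select?q=*:*&sort=inStock asc
puts documents that have no inStock value at the end of the results,
regardless of the sort direction.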
Hi everyone,
This is a two part question:
1) I see the following:
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
a) what does sortMissingLast do?
b) what kind of data is considered Boolean? TRUE, True, true, 1,
yes, Yes, FALSE, etc.
2) When searching, what do I search on:
On 6/3/2015 2:48 PM, tuxedomoon wrote:
Yes adding _solr worked, thx. But I also had to populate the SOLR_HOST param
for each of the 4 hosts, as in
SOLR_HOST=ec2-52-4-232-216.compute-1.amazonaws.com. I'm in an EC2 VPN
environment which might be the problem.
This command now works (leaving
Hi All - I've run into a problem where every once in a while one or more
of the shards (27 shard cluster) will lose connection to zookeeper and
report "updates are disabled". In addition to the CLUSTERSTATUS
timeout errors, which don't seem to cause any issue, this one certainly
does as that
Hi,
I wanted to know in detail how HTTP connections are handled in
Solr.
1. From my code, I am using CloudSolrServer of solrj client library to get
the connection. From one of my previous discussion in this forum, I
understood that Solr uses Apache's HttpClient for connections and the
On 6/3/2015 12:20 AM, Clemens Wyss DEV wrote:
Context: Lucene 5.1, Java 8 on debian. 24G of RAM whereof 16G available for
Solr.
I am seeing the following OOMs:
ERROR - 2015-06-03 05:17:13.317; [ customer-1-de_CH_1]
org.apache.solr.common.SolrException; null:java.lang.RuntimeException:
Can you share your suggester configuration?
Have you read the guide I linked?
Has the suggestion index/FST been built? (You need to build the
suggester.)
Cheers
2015-06-03 4:07 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:
Thank you for your explanation.
I'll not need to care
Context: Lucene 5.1, Java 8 on debian. 24G of RAM whereof 16G available for
Solr.
I am seeing the following OOMs:
ERROR - 2015-06-03 05:17:13.317; [ customer-1-de_CH_1]
org.apache.solr.common.SolrException; null:java.lang.RuntimeException:
java.lang.OutOfMemoryError: Java heap space
Thank you so much for your explanation.
On 2 June 2015 at 17:31, Alessandro Benedetti benedetti.ale...@gmail.com
wrote:
The scope in there is to try to make clustering lighter and more related to
the query.
The summary produced is a fragment surrounding the query terms in
the
Ciao Shawn,
thanks for your reply.
The oom script just kills Solr with the KILL signal (-9) and logs the kill.
I know. But my feeling is that not even this happens, i.e. the script is not
being executed. At least I see no solr_oom_killer-$SOLR_PORT-$NOW.log file ...
Btw:
Who re-starts solr
This is my suggester configuration:
<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str
Explain a little about why you have separate cores, and how you decide
which core a new document should reside in. Your scenario still seems a bit
odd, so help us understand.
-- Jack Krupansky
On Wed, Jun 3, 2015 at 3:15 AM, Ксения Баталова batalova...@gmail.com
wrote:
Hi!
Thanks for your
If you are using stand-alone Solr instances, then it is your
responsibility to decide which node a document resides in, and thus to
which core you will send your update request.
If, however, you used SolrCloud, it would handle that for you - deciding
which node should contain a document, and
I think there are easier ways to do what you are trying to do.
Take a look at the Function query parser.
It will allow you to control the score for each document from within a
function query. The basic use case is this:
q={!func}myFunc()&fq=my+query
In this scenario the func qparser plugin
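A concrete (made-up) instance, using the built-in product() function and a
hypothetical popularity field:

q={!func}product(popularity,0.5)&fq=text:solr

Here every document matching the fq is scored purely by the function, so
you control the ranking without touching the similarity.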
The finish method would still be a problem using the func qparser.
Out of curiosity, why do you need to call close on the scorer?
Joel Bernstein
http://joelsolr.blogspot.com/
On Wed, Jun 3, 2015 at 10:53 AM, Joel Bernstein joels...@gmail.com wrote:
I think there are easier ways to do what you
On 6/3/2015 4:12 AM, Manohar Sripada wrote:
1. From my code, I am using CloudSolrServer of solrj client library to get
the connection. From one of my previous discussion in this forum, I
understood that Solr uses Apache's HttpClient for connections and the
default maxConnections per host is 32
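If the defaults are the concern, SolrJ lets you hand CloudSolrServer your
own HttpClient; a rough sketch (the limits here are arbitrary, and you may
need to handle checked exceptions depending on your SolrJ version):

import org.apache.http.client.HttpClient;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.impl.HttpClientUtil;
import org.apache.solr.client.solrj.impl.LBHttpSolrServer;
import org.apache.solr.common.params.ModifiableSolrParams;

ModifiableSolrParams params = new ModifiableSolrParams();
params.set(HttpClientUtil.PROP_MAX_CONNECTIONS, 128);         // total pool size
params.set(HttpClientUtil.PROP_MAX_CONNECTIONS_PER_HOST, 64); // per Solr node
HttpClient httpClient = HttpClientUtil.createClient(params);
LBHttpSolrServer lb = new LBHttpSolrServer(httpClient);
CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181/solr", lb);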
Hi Mark,
what exactly should I file? What needs to be added/appended to the issue?
Regards
Clemens
-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Wednesday, June 3, 2015 14:23
To: solr-user@lucene.apache.org
Subject: Re: Solr OutOfMemory but no heap
I can see a lot of confusion in the configuration!
A few suggestions:
- read the document carefully and try to apply the suggested guidance
- currently there is no need to use spellcheck for suggestions; they
are now separate things
- I see text used to derive suggestions; I would prefer there
File a JIRA issue please. That OOM Exception is getting wrapped in a
RuntimeException, by the looks of it. Bug.
- Mark
On Wed, Jun 3, 2015 at 2:20 AM Clemens Wyss DEV clemens...@mysign.ch
wrote:
Context: Lucene 5.1, Java 8 on debian. 24G of RAM whereof 16G available
for Solr.
I am seeing the following
Hi guys, need your help (again):
I have a search handler which need to override solr's scoring. I chose to
implement it with RankQuery API, so when getTopDocsCollector() gets called
it instantiates my TopDocsCollector instance, and every dicId gets its own
score:
public class MyScorerRankQuery
We will have to find a way to deal with this long term. Browsing the code
I can see a variety of places where problematic exception handling has been
introduced since this all was fixed.
- Mark
On Wed, Jun 3, 2015 at 8:19 AM Mark Miller markrmil...@gmail.com wrote:
File a JIRA issue please. That
Hi!
Thanks for your quick reply.
The problem is that all my index consists of several parts (several cores)
and while updating I don't know in advance in which part the updated id
lies (in which core the document with the specified id lies).
For example, I have two cores (*Core1 *and *Core2*)
I’m helping someone with this but my zookeeper experience is limited (as in
none). They have purportedly followed the instructions from the wiki.
https://cwiki.apache.org/confluence/display/solr/Setting+Up+an+External+ZooKeeper+Ensemble
Jun 02, 2015 2:40:37 PM
On 6/3/2015 1:41 AM, Clemens Wyss DEV wrote:
The oom script just kills Solr with the KILL signal (-9) and logs the kill.
I know. But my feeling is that not even this happens, i.e. the script is
not being executed. At least I see no solr_oom_killer-$SOLR_PORT-$NOW.log
file ...
Btw:
Who
bq: what exactly should I file? What needs to be added/appended to the issue?
Just what Mark said, title it something like
OOM exception wrapped in runtime exception
Include your original post and that you were asked to open the JIRA
after discussion on the user's list. Don't worry too much, the
It's not entirely clear what you're trying to do when this is pushed
out, but I'm guessing it's create a collection. If that's so, then
this is your problem:
Could not find configName for collection client_active
You've set up Zookeeper correctly. But _before_ you create a
collection, you have
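In practice that means uploading a named configset to ZooKeeper first, e.g.
(the paths and ZK address here are placeholders):

./server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 \
  -cmd upconfig -confdir /path/to/client_active/conf -confname client_active

and then creating the collection with collection.configName=client_active.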
Thank you for your suggestions.
Will try that out and update on the results again.
Regards,
Edwin
On 3 June 2015 at 21:13, Alessandro Benedetti benedetti.ale...@gmail.com
wrote:
I can see a lot of confusion in the configuration!
A few suggestions:
- read the document carefully and try to
:
https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-ThesortParameter
:
: I think we may have an omission from the docs -- docValues can also be
: used for sorting, and may also offer a performance advantage.
I added a note about that.
-Hoss
Hi
I have a set of document IDs from one core and I want to query another core
using the ids retrieved from the first core... the constraint is that the
size of the doc ID set can be very large. I want to:
1) retrieve these docs from the 2nd index
2) facet on the results
I can think of 3 solutions:
Configure two suggesters, one based on each field. Use both of them and you’ll
get separate suggestions from each.
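Roughly, in solrconfig.xml (the suggester names are made up; the fields
follow the Category/Subcategory fields mentioned in the question):

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">categorySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">category</str>
  </lst>
  <lst name="suggester">
    <str name="name">subcategorySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">subcategory</str>
  </lst>
</searchComponent>

Then pass suggest.dictionary twice on the request; each block in the
response is keyed by the suggester's name, which tells you the source field.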
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
On Jun 3, 2015, at 10:03 PM, Dhanesh Radhakrishnan dhan...@hifx.co.in wrote:
Hi
Anyone
Upayavira,
I'm using stand-alone Solr instances.
I've not learnt SolrCloud yet.
Please give me some advice on when SolrCloud is better than stand-alone
Solr instances, or when it is worth choosing SolrCloud.
_ _ _
Batalova Kseniya
If you are using stand-alone Solr instances, then it is your
what would be a custom solution?
On Wed, Jun 3, 2015 at 1:58 PM, Joel Bernstein joels...@gmail.com wrote:
You may have to do something custom to meet your needs.
10,000 DocID's is not huge but your latency requirements are pretty low.
Are your DocID's by any chance integers? This can make
Erick makes a great point, if they are in the same VM try the cross-core
join first. It might be fast enough for you.
A custom solution would be to build a custom query or post filter that
works with your specific scenario. For example if the docID's are integers
you could build a fast PostFilter
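Very roughly, such a PostFilter might look like this (a sketch under several
assumptions: Solr 5.x APIs, the integer ids living in a docValues field I'm
calling id_i, and the allowed-id set parsed elsewhere):

import java.io.IOException;
import java.util.Set;

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.search.IndexSearcher;
import org.apache.solr.search.DelegatingCollector;
import org.apache.solr.search.ExtendedQueryBase;
import org.apache.solr.search.PostFilter;

public class IdSetFilter extends ExtendedQueryBase implements PostFilter {
  private final Set<Long> allowedIds;

  public IdSetFilter(Set<Long> allowedIds) {
    this.allowedIds = allowedIds;
  }

  @Override
  public boolean getCache() {
    return false;  // post filters are not cached
  }

  @Override
  public int getCost() {
    return Math.max(super.getCost(), 100);  // cost >= 100 marks it as a post filter
  }

  @Override
  public DelegatingCollector getFilterCollector(IndexSearcher searcher) {
    return new DelegatingCollector() {
      private NumericDocValues ids;

      @Override
      protected void doSetNextReader(LeafReaderContext context) throws IOException {
        ids = context.reader().getNumericDocValues("id_i");  // hypothetical int id field
        super.doSetNextReader(context);
      }

      @Override
      public void collect(int doc) throws IOException {
        // only let documents whose id is in the allowed set through
        if (ids != null && allowedIds.contains(ids.get(doc))) {
          super.collect(doc);
        }
      }
    };
  }
}

Since getCache() is false and the cost is >= 100, Solr runs this after the
cheaper query and filters, so the id check only touches documents that
already matched everything else.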
My previous suggester configuration is derived from this page:
https://wiki.apache.org/solr/Suggester
Does it mean that what is written there is outdated?
Regards,
Edwin
On 3 June 2015 at 23:44, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote:
Thank you for your suggestions.
Will try that
Hi
Can anyone help me build a suggester autocomplete based on multiple fields?
There are two fields in my schema, Category and Subcategory, and I'm trying
to build a suggester based on these 2 fields. When the suggestions are
returned, how can I distinguish which field they came from?
I used a
This may be helpful: http://lucidworks.com/blog/solr-suggester/
Note that there are a series of fixes in various versions of Solr,
particularly buildOnStartup=false and working on multivalued fields.
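With the newer SuggestComponent that post describes, the relevant knobs
look roughly like this (the suggester name and field are made up):

<lst name="suggester">
  <str name="name">mySuggester</str>
  <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
  <str name="dictionaryImpl">DocumentDictionaryFactory</str>
  <str name="field">title</str>
  <str name="buildOnStartup">false</str>
  <str name="buildOnCommit">false</str>
</lst>

and you then build the dictionary explicitly with suggest.build=true on
the request.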
Best,
Erick
On Wed, Jun 3, 2015 at 8:04 PM, Zheng Lin Edwin Yeo
edwinye...@gmail.com wrote:
My
Thank you for the quick response.
If I use 2 suggesters, can I get the result in a single request?
http://192.17.80.99:8983/solr/core1/suggest?suggest=true&suggest.dictionary=mySuggester&wt=xml&suggest.q=school
Is there any help document on building multiple suggesters?
On Thu, Jun 4, 2015 at
Are these indexes on different machines? Because if they're in the
same JVM, you might be able to use cross-core joins. Be aware, though,
that joining on high-cardinality fields (which, by definition, docID
probably is) is where pseudo joins perform worst.
Have you considered flattening the data
I have to ask then why you're not using SolrCloud with multiple shards? It
seems to me that gives you the indexing throughput you need (be sure to
use CloudSolrServer from your client). At 300M complex documents, you
pretty much certainly will need to shard anyway so in some sense you're
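From the client side that looks something like this (a sketch; the ZK
addresses and collection name are placeholders):

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
server.setDefaultCollection("collection1");

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doc-1");
doc.addField("text", "some content");
server.add(doc);   // routed to the correct shard leader automatically
server.commit();

CloudSolrServer watches the cluster state in ZooKeeper, so adds and queries
go to the right shards without you tracking which core holds which document.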
A few questions for you:
How large can the list of filtering ID's be?
What's your expectation on latency?
What version of Solr are you using?
SolrCloud or not?
Joel Bernstein
http://joelsolr.blogspot.com/
On Wed, Jun 3, 2015 at 1:23 PM, Robust Links pey...@robustlinks.com wrote:
Hi
I
Hey Joel
see below
On Wed, Jun 3, 2015 at 1:43 PM, Joel Bernstein joels...@gmail.com wrote:
A few questions for you:
How large can the list of filtering ID's be?
10k
What's your expectation on latency?
10 < latency < 100
What version of Solr are you using?
5.0.0
SolrCloud or
You may have to do something custom to meet your needs.
10,000 DocID's is not huge but your latency requirements are pretty low.
Are your DocID's by any chance integers? This can make custom PostFilters
run much faster.
You should also be aware of the Streaming API in Solr 5.1 which will give
Jack,
The decision to use several cores was made to increase indexing and
searching performance (experimentally).
In my project the index is about 300-500 million documents (each document
has a rather complex structure) and it may grow larger.
So, while indexing, the documents are being added in