On 2/7/2018 11:40 PM, Srinivas Kashyap wrote:
> We have configured the Solr index server on Tomcat and fetch data from a
> database to index it. We have implemented delta-query indexing based on modify_ts.
What version of Solr? Just as an FYI: Since version 5.0, running in
user-provided
Hi Deepthi,
Is the dictionary static? Can the value for some ID change? If it is static, and if
query performance matters to you, the best and also the simplest solution is to
denormalise the data and store the dictionary values with the docs.
Alternative is to use join query parser:
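A minimal sketch of what such a join request could look like, built as request parameters. The collection name (`dict`) and field names (`id`, `ids`, `description`) are placeholders, not the poster's actual schema:

```python
from urllib.parse import urlencode

# Search the dictionary collection for the text, then join the matching
# ids onto the main collection's multi-valued "ids" field.
# All collection/field names here are illustrative assumptions.
params = {
    "q": '{!join fromIndex=dict from=id to=ids}description:"some text"',
    "fl": "id,title",
}
query_string = urlencode(params)
print(query_string)
```

The join runs the inner query against the `fromIndex` collection and maps its `from` field values onto the `to` field of the collection being searched.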
I have a dictionary of 2 IDs and their descriptions, which is in a
collection. There is another Solr collection in which each document has 10
or more IDs (a multi-valued field). I would like to text-search the
dictionary, bring back the matched IDs, and search for these IDs in Solr
Hello,
We have configured the Solr index server on Tomcat and fetch data from a
database to index it. We have implemented delta-query indexing based on modify_ts.
In our data-config.xml we have a parent entity and 17 child entities. We have 18
such Solr cores. When we call delta-import on a
I have a dictionary of 2 IDs and their descriptions, which is in a
collection. There is another Solr collection in which each document has 10
or more IDs (a multi-valued field). I would like to text-search the
dictionary, bring back the matched IDs, and search for these IDs in Solr
Hi,
I am using the payload_score query parser, as below:
{!payload_score f=field v=$q func=max includeSpanScore=true}.
The issue is that the payload value in this field is around the range
1-1.
Due to this, the boosts added to other fields are never effective as
maximum of the
I am seeing that after some time, hard commits in all my Solr cores stop, and
each one's searcher has an "opened at" date from hours ago, even though they
are continuing to ingest data successfully (the index size is increasing
continuously).
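For reference, searcher reopening is driven by the commit settings in solrconfig.xml; a sketch of the relevant block (the intervals are illustrative, not a recommendation):

```xml
<!-- solrconfig.xml: hard commits flush to disk; with openSearcher=false
     they do NOT reopen a searcher -->
<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<!-- soft commits are what make newly ingested documents visible -->
<autoSoftCommit>
  <maxTime>15000</maxTime>
</autoSoftCommit>
```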
@Emir: The 'sow' parameter in edismax, along with the nested query
'_query_', works. Tuning has to be done for the desired relevancy.
@Walter: It would be nice to have SOLR-629 integrated into the project. As
Emir suggested, _query_ caters to my need by applying the fuzzy parameter to
the query.
It can pretty much be used as-is, _except_
you'll find one or more entries in your request handlers like:
_text_
Change "_text_" to a field that exists in your schema; that's the default search
field used when you don't field-qualify your search terms.
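For reference, the entry in question typically looks like this in solrconfig.xml (the field name `my_default_field` is a placeholder for one in your schema):

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="rows">10</str>
    <!-- change this from _text_ to a real field in your schema -->
    <str name="df">my_default_field</str>
  </lst>
</requestHandler>
```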
Note that if you take out, for instance, all of your
Hey Erick, thanks for the clarification! What about the solrconfig.xml file?
Sure, it should be customized to suit one's needs, but can it be used as a
base, or is it best to create one from scratch?
Thanks,
Pratik
On Wed, Feb 7, 2018 at 5:29 PM, Erick Erickson
wrote:
>
Agree with Walter, this is seeming like an XY problem. Also, Solr does
_not_ implement strict boolean logic, see:
https://lucidworks.com/2011/12/28/why-not-and-or-and-not/
Best,
Erick
On Wed, Feb 7, 2018 at 1:49 PM, Walter Underwood wrote:
> I understand what you are
That's really the point of the default managed-schema, to be a base
you use for your customizations. In fact, I often _remove_ most of the
fields (and especially fieldTypes) that I don't need. This includes
dynamic fields, copyFields and the like.
Sometimes it's actually easier, though, to just
Hello all,
I have added some fields to the default managed-schema file. I was wondering if
it is safe to take the default managed-schema file as-is and add your own
fields to it in production. What is the best practice for this? As I
understand it, it should be safe to use the default schema as a base if
I forgot to report back on this. For anyone who runs into it: you need
the entire data directory, not just the index directory; at least that's
what made it work for me.
On Thu, Feb 1, 2018 at 9:52 PM, Erick Erickson
wrote:
> I think SCP will be fine. Shawn's comment
Hi,
I am using the MoreLikeThis handler to get related documents for a given
document. To determine whether I am getting good results, here is what I
do:
The same original document should be returned as the top match.
If it is not, then there is some problem with the relevancy.
Then, as same input
Thanks for replying Alessandro.
I am passing these parameters:
q=polt=polt=json=true=true=7=true=true=true=3=3=true=0.72
On Thu, Jan 25, 2018 at 4:28 AM, alessandro.benedetti
wrote:
> Can you tell us the request parameters used for the spellcheck ?
>
> In particular
Hello,
I am attempting to tune the results I retrieve from Solr to boost
the importance of certain fields. The syntax of the query I am using is as
follows:
I understand what you are asking for. Solr doesn’t work like that. Solr is not
a programming language. Short-circuit evaluation isn’t especially useful for a
search engine.
Most of the work is fetching and uncompressing the posting lists. Calculating
the score for each document is pretty fast.
Walter, it's just that I have a use case (evaluating one field over another)
for which I am trying out multiple solutions in order to avoid making
multiple calls to Solr.
I am trying to do short-circuit evaluation.
Short-circuit evaluation, minimal evaluation, or McCarthy evaluation (after
I’ve been messing around with the Solr 7.2 autoscaling framework this week.
Some things seem trivial, but I’m also running into questions and issues. If
anyone else has experience with this stuff, I’d be glad to hear it.
Specifically:
Context:
-One collection, consisting of 42 shards, where
You don’t get to control the order of execution, other than specifying a filter
query.
I think you have the wrong mental model of how Solr does search.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 7, 2018, at 1:28 PM, bbarani
Hello,
I am trying to use Payload fields to store per-zone delivery dates for
products. I have an index where my documents are products and for each product
we want to store a date by when we can deliver that product for 1-100 different
zones. Since the payload() function only supports int and
Thanks Erick. I will check this out.
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
You are right. I don't care about the score; rather, I want a document
containing a specific term in a specific field to be evaluated first before
checking the next field.
Yeah, sometimes the sugar methods/classes in SolrJ lag a bit behind
the Collections API. But at root, all these classes do is create
a ModifiableSolrParams with all the params you'd specify and make an
HTTP call via the AsyncCollectionAdminRequest.process command, last I
knew.
Best,
Erick
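To sketch what that amounts to in practice, here is the equivalent raw Collections API call built as an HTTP URL (host, collection name, and replication factor are placeholders; on the SolrJ side the same parameters would go into a ModifiableSolrParams):

```python
from urllib.parse import urlencode

# MODIFYCOLLECTION is part of the Collections API; there is no dedicated
# SolrJ sugar method, so we just assemble the request parameters ourselves.
params = {
    "action": "MODIFYCOLLECTION",
    "collection": "mycollection",   # placeholder collection name
    "replicationFactor": 2,         # the attribute being changed
}
url = "http://localhost:8983/solr/admin/collections?" + urlencode(params)
print(url)
```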
If you don't care about its contribution to scoring, one option is to
move the clause you want evaluated last into an fq clause with {!cache=false
cost=101}. See: http://yonik.com/advanced-filter-caching-in-solr/
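A sketch of such a request, using the field names from the question (they are illustrative, not a verified schema):

```python
from urllib.parse import urlencode

# cost >= 100 together with cache=false makes the fq a post-filter:
# it is only evaluated against documents that already matched q.
params = {
    "q": 'searchTerms:"testing"',
    "fq": '{!cache=false cost=101}matchStemming:"stemming"',
}
qs = urlencode(params)
print(qs)
```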
Best,
Erick
On Wed, Feb 7, 2018 at 12:05 PM, Emir Arnautović
On 2/7/2018 12:01 PM, Susheel Kumar wrote:
> Just trying to find where we set the swap space available to the Solr process.
> I see that on our 6.0 instances it was set to 2GB, and on 6.6 instances it's
> set to 16GB.
Solr has absolutely no involvement or control over swap space. Neither
does Java. This
Hi Susheel,
Swap space is an OS thing, not a Solr thing. You should look into how to disable
swap space, or at least set swappiness to some low number, on your OS.
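For example, on Linux that setting is a kernel parameter; a hedged sketch (the exact value is a tuning choice, and 0 can disable swapping entirely on some kernels, so a low non-zero value is often preferred):

```
# /etc/sysctl.conf -- discourage the kernel from swapping out the Solr heap
vm.swappiness=1
```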
HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/
Hi,
Also note that the score is different if only one term matches versus if both
terms are matched. Your case would make sense if you do not plan to order by
score, but as Walter explained, Solr does not go document by document evaluating
query conditions; it gets the list of documents matching each
Hi Folks,
This Friday is the last day to submit abstracts and talks around Solr and
Big Data Search. Could you please help reach out to other people in the Solr
community to get the word out?
Regards,
Ana Castro
Hi Folks,
DataWorks Summit San
Hello,
Just trying to find where we set the swap space available to the Solr process. I
see that on our 6.0 instances it was set to 2GB, and on 6.6 instances it's set
to 16GB.
Thanks,
Susheel
That doesn’t really make sense for Solr query evaluation. It fetches the
posting lists for each term, then walks through them evaluating the query
against all the documents.
It can skip a document as soon as it fails the query, but it still has to fetch
the posting lists.
So, that feature
Hi,
I'm unable to find how to do a MODIFYCOLLECTION via SolrJ. I would
like to change the replication factor of a collection but can't find it
in the SolrJ API. Is that not supported?
regards,
Hendrik
I am trying to figure out a way to form a boolean (||) query in Solr.
Ideally my expectation is that with the boolean operator ||, if the first term
is true the second term shouldn't be evaluated.
=searchTerms:"testing" || matchStemming:"stemming"
works the same as
=searchTerms:"testing" OR
Is it possible to use highlighting over date fields?
We've tried, but we get no highlighting response for the field.
On 2/7/2018 8:08 AM, Shawn Heisey wrote:
If your queries are producing the correct results,
then I will tell you that the "summary" part of your query example is
quite possibly completely unnecessary
After further thought, I have concluded that this part of what I said is
probably completely
I think you need the feature in SOLR-629 that adds fuzzy to edismax.
https://issues.apache.org/jira/browse/SOLR-629
The patch on that issue is for Solr 4.x, but I believe someone is working on a
new patch.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my
On 2/7/2018 5:20 AM, Maulin Rathod wrote:
> Further analyzing the issue, we found that asking for too many rows (e.g.
> rows=1000) can cause a full GC problem, as mentioned in the link below.
This is because when you ask for 10 million rows, Solr allocates a
memory structure capable of storing
Thanks Webster,
I created https://issues.apache.org/jira/browse/SOLR-11955 to work on this.
--
Steve
www.lucidworks.com
> On Feb 6, 2018, at 2:47 PM, Webster Homer wrote:
>
> I noticed that in some of the current example schemas that are shipped with
> Solr, there is a
Just to clarify:
I can only cause this to happen when using the complexphrase query parser.
Lucene/dismax/edismax parsers are not affected.
2018-02-07 13:09 GMT+01:00 Bjarke Buur Mortensen :
> Hello list,
>
> Whenever I make a query for ** (two consecutive wildcards) it
Hi Maulin,
I'll chime in by referring to my own findings when analyzing Solr
performance:
https://www.mail-archive.com/solr-user@lucene.apache.org/msg135857.html
Yonik has a good article about paging:
http://yonik.com/solr/paging-and-deep-paging/. While it's about deep
paging, the same
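The cursor-based alternative described in that article, sketched as request parameters (the uniqueKey field name `id` is an assumption; the sort must end with the uniqueKey):

```python
from urllib.parse import urlencode

# First page of a cursor walk. Instead of increasing start= (which forces
# Solr to collect and discard all preceding rows), pass the nextCursorMark
# returned by each response as cursorMark on the next request.
params = {
    "q": "*:*",
    "rows": 100,
    "sort": "score desc, id asc",  # must include the uniqueKey as tiebreaker
    "cursorMark": "*",             # "*" starts a new cursor
}
qs = urlencode(params)
print(qs)
```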
Hi Erick,
Thanks for your response. It shows GC pauses in the Solr GC logs (refer to the
Solr GC log below, where it shows a 138.4138211 sec pause).
It seems like some bad query causes high memory allocation.
Further analyzing the issue, we found that asking for too many rows (e.g.
rows=1000) can cause
Hello list,
Whenever I make a query for ** (two consecutive wildcards) it causes my
Solr to run out of memory.
http://localhost:8983/solr/select?q=**
Why is that?
I realize that this is not a reasonable query to make, but the system
supports input from users, and they might by accident input
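One defensive option, not from this thread but a common client-side guard, is to rewrite bare-wildcard user input before it ever reaches Solr (the function and its rules are a sketch, not a Solr feature):

```python
import re

def sanitize_query(q: str) -> str:
    """Collapse runs of '*' and map a pure-wildcard query to *:*.

    '**' and friends become the cheap match-all query instead of being
    parsed as expensive multi-wildcard terms.
    """
    q = re.sub(r"\*{2,}", "*", q.strip())
    if q in ("", "*"):
        return "*:*"
    return q

print(sanitize_query("**"))
print(sanitize_query("title:foo*"))
```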
Hi Sravan,
Edismax has a ’sow’ parameter that, when false, makes edismax pass the whole
query to field analysis, but I'm not sure how it will work with fuzzy search.
What you might do is use the _query_ syntax to separate shingle and non-shingle
queries, e.g.
q=_query({!edismax sow=false qf=title_bigrams}$v) OR
We have the following two fields for our movie title search:
- title without symbols:
a custom analyser with WordDelimiterFilterFactory, SynonymFilterFactory and
other filters to retain only alphanumeric characters.
- title with word bigrams:
a custom analyser with solr.ShingleFilterFactory to
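For reference, a bigram field type along those lines might be declared like this (the name, tokenizer, and sizes are illustrative, not the poster's actual schema):

```xml
<fieldType name="text_bigrams" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- emit word bigrams ("shingles") instead of single tokens -->
    <filter class="solr.ShingleFilterFactory" minShingleSize="2"
            maxShingleSize="2" outputUnigrams="false"/>
  </analyzer>
</fieldType>
```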
> Maybe this is the issue:
https://github.com/eclipse/jetty.project/issues/2169
Looks like it is the issue. (I've redacted IP addresses below for security
reasons.)
solr [ /opt/solr ]$ netstat -ptan | awk '{print $6 " " $7 }' | sort | uniq
-c
8425 CLOSE_WAIT -
92 ESTABLISHED -
1