Re: Urgent- General Question about document Indexing frequency in solr

2021-02-04 Thread Scott Stults
Manisha, The most general recommendation around commits is to not explicitly commit after every update. There are settings that will let Solr automatically commit after some threshold is met, and by delegating commits to that mechanism you can generally ingest faster. See this blog post that

Urgent- General Question about document Indexing frequency in solr

2021-02-03 Thread Manisha Rahatadkar
Hi All Looking for some help on document indexing frequency. I am using apache solr 7.7 and SolrNet library to commit documents to Solr. Summary for this function is: // Summary: // Commits posted documents, blocking until index changes are flushed to disk and // blocking until a new

Question: JavaBinCodec cannot handle BytesRef object

2021-01-12 Thread Boqi Gao
Dear all: We are facing a problem recently when we are utilizing BinaryDocValueField of solr 7.3.1. We have created a binary docValue field. The constructor of BinaryDocValuesField(String name, BytesRef value) needs a BytesRef object to be set as its fieldData. However, the JavaBinCodec cannot

Re: Question on solr metrics

2020-10-27 Thread Emir Arnautović
Hi, In order to see time range metrics, you’ll need to collect metrics periodically, send them to some storage, and then query/visualise. Solr has exporters for some popular backends, or you can use some cloud based solution. One such solution is our:

Question on solr metrics

2020-10-26 Thread yaswanth kumar
Can we get the metrics for a particular time range? I know metrics history was not enabled, so I will only have data from when the Solr node was last started, but even within that, can we query a date range, for example to see CPU usage over a particular time range? Note: Solr

Re: Question on metric values

2020-10-26 Thread Andrzej Białecki
The “requests” metric is a simple counter. Please see the documentation in the Reference Guide on the available metrics and their meaning. This counter is initialised when the replica starts up, and it’s not persisted (so if you restart this Solr node it will reset to 0). If by “frequency”
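The counter semantics described above suggest one way to turn the raw number into a rate: sample it periodically and divide the difference by the elapsed time. The following is a minimal sketch, not anything provided by Solr; the function name `request_rate` and the `(timestamp, count)` sample format are assumptions made for illustration.

```python
def request_rate(prev, curr):
    """Requests per second between two (timestamp_seconds, count) samples.

    Hypothetical helper: the samples are assumed to be readings of the
    'requests' counter that the caller collected at two points in time.
    """
    (t0, c0), (t1, c1) = prev, curr
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    # A lower count than before implies the node restarted and the
    # counter started again from 0, so only the new count is usable.
    delta = c1 - c0 if c1 >= c0 else c1
    return delta / (t1 - t0)


print(request_rate((0, 100), (60, 700)))  # 10.0
print(request_rate((0, 100), (60, 30)))   # 0.5 (counter reset in between)
```

After a restart the rate is only a lower bound, since requests served before the reset are lost.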

Question on metric values

2020-10-26 Thread yaswanth kumar
I am new to the metrics API in Solr. When I try solr/admin/metrics?prefix=QUERY./select.requests it returns numbers for each collection that I have. I understand those are the requests coming in against each collection, but at what frequency?? Like are those numbers from the

Re: TieredMergePolicyFactory question

2020-10-26 Thread Moulay Hicham
Thanks Shawn and Erick. So far I haven't noticed any performance issues before and after the change. My concern all along is COST. We could have left the configuration as is - keeping the deleted documents in the index - But we have to scale up our Solr cluster. This will double our Solr

Re: TieredMergePolicyFactory question

2020-10-26 Thread Erick Erickson
“Some large segments were merged into 12GB segments and deleted documents were physically removed.” and “So with the current natural merge strategy, I need to update solrconfig.xml and increase the maxMergedSegmentMB often” I strongly recommend you do not continue down this path. You’re making a

Re: TieredMergePolicyFactory question

2020-10-26 Thread Shawn Heisey
On 10/25/2020 11:22 PM, Moulay Hicham wrote: I am wondering about 3 other things: 1 - You mentioned that I need free disk space. Just to make sure that we are talking about disc space here. RAM can still remain at the same size? My current RAM size is Index size < RAM < 1.5 Index size You

Re: TieredMergePolicyFactory question

2020-10-25 Thread Moulay Hicham
l? > Depending on the environment, you may not even be able to measure > performance changes so this all may be irrelevant anyway. > > But to your question. Yes, you can cause regular merging to more > aggressively > merge segments with deleted docs by setting the > deletesPctAl

Re: TieredMergePolicyFactory question

2020-10-23 Thread Erick Erickson
that this is worth any effort at all? Depending on the environment, you may not even be able to measure performance changes so this all may be irrelevant anyway. But to your question. Yes, you can cause regular merging to more aggressively merge segments with deleted docs by setting the deletesPctAllowed

Re: TieredMergePolicyFactory question

2020-10-23 Thread Moulay Hicham
Thanks Eric. My index is near real time and frequently updated. I checked this page https://lucene.apache.org/solr/guide/8_1/uploading-data-with-index-handlers.html#xml-update-commands and using forceMerge/expungeDeletes are NOT recommended. So I was hoping that the change in mergePolicyFactory

Re: TieredMergePolicyFactory question

2020-10-23 Thread Erick Erickson
Just go ahead and optimize/forceMerge, but do _not_ optimize to one segment. Or you can expungeDeletes, that will rewrite all segments with more than 10% deleted docs. As of Solr 7.5, these operations respect the 5G limit. See: https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/
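The 10% rule mentioned above can be illustrated with a small selection function. This is only a sketch of the stated threshold, not Lucene's merge logic; the segment names and document counts are made up for illustration.

```python
def segments_to_rewrite(segments, threshold_pct=10.0):
    """Names of segments whose deleted-doc percentage exceeds the threshold.

    `segments` is a list of (name, max_doc, deleted_docs) tuples; the
    tuple layout is an assumption made for this example.
    """
    selected = []
    for name, max_doc, deleted in segments:
        if max_doc > 0 and 100.0 * deleted / max_doc > threshold_pct:
            selected.append(name)
    return selected


# Hypothetical segments: 40% deleted, 5% deleted, exactly 10% deleted.
segments = [("_a", 1000, 400), ("_b", 1000, 50), ("_c", 1000, 100)]
print(segments_to_rewrite(segments))  # ['_a']
```

A segment at exactly 10% is not selected here because the snippet says "more than 10%".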

TieredMergePolicyFactory question

2020-10-23 Thread Moulay Hicham
Hi, I am using solr 8.1 in production. We have about 30%-50% of deleted documents in some old segments that were merged a year ago. The size of these segments is about 5GB. I was wondering why these segments have a high % of deleted docs and found out that they are NOT candidates for merging

Re: Question about solr commits

2020-10-08 Thread Erick Erickson
This is a bit confused. There will be only one timer that starts at time T when the first doc comes in. At T+15 seconds, all docs that have been received since time T will be committed. The first doc to hit Solr _after_ T+15 seconds starts a single new timer and the process repeats. Best, Erick
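The single-timer behaviour described above can be modelled in a few lines. This is a sketch of the explanation only, not Solr's implementation; `commit_times` and its arguments are names invented for the example, and times are plain seconds.

```python
def commit_times(arrivals, max_time):
    """Commit times under the single-timer model described above."""
    commits = []
    deadline = None  # no timer is running until a document arrives
    for t in sorted(arrivals):
        if deadline is not None and t > deadline:
            # The running timer fired before this document arrived.
            commits.append(deadline)
            deadline = None
        if deadline is None:
            # First document since the last commit starts one new timer.
            deadline = t + max_time
    if deadline is not None:
        commits.append(deadline)  # the last pending timer still fires
    return commits


print(commit_times([0, 10], 15))      # [15]
print(commit_times([0, 10, 20], 15))  # [15, 35]
```

With updates at t=0 and t=10 and a 15 second interval, the model yields one commit at t=15 that covers both updates.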

Re: Question about solr commits

2020-10-08 Thread Rahul Goswami
Shawn, So if the autoCommit interval is 15 seconds, and one update request arrives at t=0 and another at t=10 seconds, then will there be two timers one expiring at t=15 and another at t=25 seconds, but this would amount to ONLY ONE commit at t=15 since that one would include changes from both

Re: Question about solr commits

2020-10-07 Thread yaswanth kumar
Thank you very much both Eric and Shawn Sent from my iPhone > On Oct 7, 2020, at 10:41 PM, Shawn Heisey wrote: > > On 10/7/2020 4:40 PM, yaswanth kumar wrote: >> I have the below in my solrconfig.xml >> >> >> ${solr.Data.dir:} >> >> >>

Re: Question about solr commits

2020-10-07 Thread Shawn Heisey
On 10/7/2020 4:40 PM, yaswanth kumar wrote: I have the below in my solrconfig.xml ${solr.Data.dir:} ${solr.autoCommit.maxTime:6} false ${solr.autoSoftCommit.maxTime:5000} Does this mean even though we are always sending

Re: Question about solr commits

2020-10-07 Thread Erick Erickson
Yes. > On Oct 7, 2020, at 6:40 PM, yaswanth kumar wrote: > > I have the below in my solrconfig.xml > > > > ${solr.Data.dir:} > > > ${solr.autoCommit.maxTime:6} > false > > > ${solr.autoSoftCommit.maxTime:5000} > > > > Does this mean even

Question about solr commits

2020-10-07 Thread yaswanth kumar
I have the below in my solrconfig.xml ${solr.Data.dir:} ${solr.autoCommit.maxTime:6} false ${solr.autoSoftCommit.maxTime:5000} Does this mean even though we are always sending data with commit=false on update solr api, the above

Re: Solr 7.6 query performace question

2020-10-01 Thread raj.yadav
harjags wrote > Below errors are very common in 7.6 and we have solr nodes failing with > tanking memory. > > The request took too long to iterate over terms. Timeout: timeoutAt: > 162874656583645 (System.nanoTime(): 162874701942020), >

Quick Question

2020-09-02 Thread William Morin
Hi, I was looking for some articles to read about "Schema Markup" today when I stumbled on your [ https://cwiki.apache.org/confluence/display/SOLR/UsingMailingLists ]. Very cool. Anyway, I noticed that there a text in your blog "Schema Markup" and luckily it's my keyword. I hope if you don't

Re: Question on sorting

2020-07-23 Thread Saurabh Sharma
Hi, It is because the field is a string and the numbers are getting sorted lexicographically. It has nothing to do with the number of digits. Thanks Saurabh On Thu, Jul 23, 2020, 11:24 AM Srinivas Kashyap wrote: > Hello, > > I have schema and field definition as shown below: > > omitNorms="true"/> > > >
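The difference between lexicographic and numeric ordering is easy to reproduce outside Solr. The sketch below uses the two TRACK_ID values from the original question; it only demonstrates the ordering effect and says nothing about how the schema should be changed.

```python
track_ids = ["84806", "124561"]

# Sorting the values as strings compares character by character,
# so "8..." ranks above "1..." regardless of length.
as_strings_desc = sorted(track_ids, reverse=True)

# Sorting with a numeric key compares the actual values.
as_numbers_desc = sorted(track_ids, key=int, reverse=True)

print(as_strings_desc)  # ['84806', '124561']
print(as_numbers_desc)  # ['124561', '84806']
```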

Question on sorting

2020-07-22 Thread Srinivas Kashyap
Hello, I have schema and field definition as shown below: TRACK_ID field contains "NUMERIC VALUE". When I use sort on track_id (TRACK_ID desc) it is not working properly. ->I have below values in Track_ID Doc1: "84806" Doc2: "124561" Ideally, when I use sort command, query result should

Re: Question regarding replica leader

2020-07-20 Thread Vishal Vaibhav
So how do we recover from such a state? When I try addreplica, it returns a 503. Also, my node has multiple replicas, and most of them are dead. How do we get rid of those dead replicas via a script? Is that a possibility? On Mon, 20 Jul 2020 at 11:00 AM, Radu Gheorghe wrote: > Hi

Re: Question regarding replica leader

2020-07-19 Thread Radu Gheorghe
Hi Vishal, I think that’s true, yes. The cluster has a leader (overseer), but this particular shard doesn’t seem to have a leader (yet). Logs should give you some pointers about why this happens (it may be, for example, that each replica is waiting for the other to become a leader, because

Re: Question regarding replica leader

2020-07-19 Thread Vishal Vaibhav
Hi any pointers on this ? On Wed, 15 Jul 2020 at 11:13 AM, Vishal Vaibhav wrote: > Hi Solr folks, > > I am using solr cloud 8.4.1 . I am using* > `/solr/admin/collections?action=CLUSTERSTATUS`*. Hitting this endpoint I > get a list of replicas in which one is active but neither of them is >

Question regarding replica leader

2020-07-14 Thread Vishal Vaibhav
Hi Solr folks, I am using solr cloud 8.4.1 . I am using* `/solr/admin/collections?action=CLUSTERSTATUS`*. Hitting this endpoint I get a list of replicas in which one is active but neither of them is leader. Something like this "core_node72": {"core": "rules_shard1_replica_n71","base_url":

Re: eDismax query syntax question

2020-06-16 Thread Shawn Heisey
On 6/15/2020 8:01 AM, Webster Homer wrote: Only the minus following the parenthesis is treated as a NOT. Are parentheses special? They're not mentioned in the eDismax documentation. Yes, parentheses are special to edismax. They are used just like in math equations, to group and separate

Re: eDismax query syntax question

2020-06-15 Thread Mikhail Khludnev
he reference, but that doesn't answer my question. If - is a > special character, it's not consistently special. In my example > "3-DIMETHYL" behaves quite differently than ")-PYRIMIDINE". If I escape > the closing parenthesis the following minus no longer behav

Re: eDismax query syntax question

2020-06-15 Thread Andrea Gazzarini
he reference, but that doesn't answer my question. If - is a > special character, it's not consistently special. In my example > "3-DIMETHYL" behaves quite differently than ")-PYRIMIDINE". If I escape > the closing parenthesis the following minus no longer behaves specially. >

Re: Question about Atomic Update

2020-06-15 Thread david . davila
Tecnologías de Análisis de la Información e Investigación del Fraude Telephone: 915828763 Extension: 36763 From: "Erick Erickson" To: solr-user@lucene.apache.org Date: 15/06/2020 14:27 Subject: Re: Question about Atomic Update All Atomic Updates do is 1> read all the

RE: eDismax query syntax question

2020-06-15 Thread Webster Homer
Markus, Thanks, for the reference, but that doesn't answer my question. If - is a special character, it's not consistently special. In my example "3-DIMETHYL" behaves quite differently than ")-PYRIMIDINE". If I escape the closing parenthesis the following minus no long

Re: Question about Atomic Update

2020-06-15 Thread Erick Erickson
result set. It’s not clear whether you search the text field, but if not you can store it somewhere else and only fetch it as needed. Best, Erick > On Jun 15, 2020, at 7:55 AM, david.dav...@correo.aeat.es wrote: > > Hi, > > I have a question related with atomic update in Solr. &

Question about Atomic Update

2020-06-15 Thread david . davila
Hi, I have a question related to atomic update in Solr. In our collection, documents have a lot of fields, most of them small. However, there is one of them that includes the text of the document. Sometimes, not often fortunately, this text is very long, more than 3 or 4 MB of plain text

RE: eDismax query syntax question

2020-06-13 Thread Markus Jelsma
To: solr-user@lucene.apache.org > Subject: eDismax query syntax question > > Recently we found strange behavior in a query. We use eDismax as the query > parser. > > This is the query term: > 1,3-DIMETHYL-5-(3-PHENYL-ALLYLIDENE)-PYRIMIDINE-2,4,6-TRIONE > > It should hi

eDismax query syntax question

2020-06-12 Thread Webster Homer
Recently we found strange behavior in a query. We use eDismax as the query parser. This is the query term: 1,3-DIMETHYL-5-(3-PHENYL-ALLYLIDENE)-PYRIMIDINE-2,4,6-TRIONE It should hit one document in our index. It does not. However, if you use the Dismax query parser it does match the record.

Re: question about setup for maximizing solr performance

2020-06-01 Thread Shawn Heisey
On 6/1/2020 9:29 AM, Odysci wrote: Hi, I'm looking for some advice on improving performance of our solr setup. Does anyone have any insights on what would be better for maximizing throughput on multiple searches being done at the same time? thanks! In almost all cases, adding memory will

question about setup for maximizing solr performance

2020-06-01 Thread Odysci
Hi, I'm looking for some advice on improving performance of our solr setup. In particular, about the trade-offs between applying larger machines, vs more smaller machines. Our full index has just over 100 million docs, and we do almost all searches using fq's (with q=*:*) and facets. We are using

Question for SOLR-14471

2020-05-26 Thread Kayak28
Hello, Solr community members: I am working on translating Solr's release note every release. Now, I am not clear about what SOLR-14471 actually fixes. URL for SOLR-14471: https://issues.apache.org/jira/browse/SOLR-14471 My questions are the following. - what does "all inherently equivalent

Re: LTR - FieldValueFeature Question

2020-04-26 Thread Dmitry Paramzin
It seems that in order to be available for FieldValueFeature score calculation, the field should be 'stored', otherwise it is not present in the document. It also seems that indexed/docValue does not matter: final IndexableField indexableField = document.getField(field);

LTR - FieldValueFeature Question

2020-04-24 Thread Ashwin Ramesh
Hi everybody, Do we need to have 'indexed=true' to be able to retrieve the value of a field via FieldValueFeature or is having docValue=true enough? Currently, we have some dynamic fields as [dynamicField=true, stored=false, indexed=false, docValue=true]. However, we are noticing that the value

Re: A question about underscore

2020-04-06 Thread Erick Erickson
I _strongly_ urge you to become acquainted with the Admin UI, particularly the “analysis” section. It’ll show you exactly what transformations each step in your analysis chain performs. Without you providing the fieldType definition, all I can do is guess but my guess is that you have

A question about underscore

2020-04-06 Thread chalaulait 808
I am using Solr4.0.13 to implement the search function of the document management system. I am currently having issues with search results when the search string contains an underscore. For example, if I search for the character string "AAA_001", the search results will return results like "AAA"

Autoscaling question

2020-03-26 Thread Kudrettin Güleryüz
Hi, I'd like to balance freedisk and cores across eight nodes. Here is my cluster-preferences and cluster-policy: { "responseHeader":{ "status":0, "QTime":0}, "cluster-preferences":[{ "precision":10, "maximize":"freedisk"} ,{ "minimize":"cores",

Disastor Scenario Question Regarding Tlog+pull solrcloud setup

2020-03-04 Thread Sandeep Dharembra
Hi, My question is about the solrcloud cluster we are trying to have. We have a collection with Tlog and pull type replicas. We intend to keep all the tlogs on one node and use that for writing and pull replicas distributed on the remaining nodes. What we have noticed is that when the Tlog node

Question About Solr Query Parser

2020-03-02 Thread Kayak28
Hello, Community: I have a question about interpreting a parsed query from Debug Query. I used Solr 8.4.1 and LuceneQueryParser. I was learning the behavior of ManagedSynonymFilter because I was curious about how "ManagedSynonymGraphFilter" fails to generate a graph. So, I try to

Re: Solr Cloud Question

2020-02-24 Thread Erick Erickson
; data indexed on the nodes and I am able to query the data from my website. > > The question I have is what impact it will have for me to stop one of the > solr cloud nodes and then restart it. I want to test if my alarms are right > or not. > > Thank you >

Solr Cloud Question

2020-02-24 Thread Kevin Sante
from my website. The question I have is what impact it will have for me to stop one of the solr cloud nodes and then restart it. I want to test if my alarms are right or not. Thank you

Re: A question about solr filter cache

2020-02-18 Thread Erick Erickson
uary 18, 2020 15:27 > To: solr-user@lucene.apache.org > Subject: RE: A question about solr filter cache > > Hi! > Yes, it may depends on Solr version > Solr 8.3 Admin filterCache page stats looks like: > > stats: > CACHE.searcher.filterCache.cleanupThread:

Re: A question about solr filter cache

2020-02-17 Thread Hongxu Ma
@Vadim Ivanov<mailto:vadim.iva...@spb.ntk-intourist.ru> Thank you! From: Vadim Ivanov Sent: Tuesday, February 18, 2020 15:27 To: solr-user@lucene.apache.org Subject: RE: A question about solr filter cache Hi! Yes, it may depend on Solr version Solr 8.3

RE: A question about solr filter cache

2020-02-17 Thread Vadim Ivanov
o:inte...@outlook.com] > Sent: Tuesday, February 18, 2020 5:32 AM > To: solr-user@lucene.apache.org > Subject: Re: A question about solr filter cache > > @Erick Erickson<mailto:erickerick...@gmail.com> and @Mikhail Khludnev > > got it, the explanation is very cl

Re: A question about solr filter cache

2020-02-17 Thread Hongxu Ma
@Erick Erickson<mailto:erickerick...@gmail.com> and @Mikhail Khludnev got it, the explanation is very clear. Thank you for your help. From: Hongxu Ma Sent: Tuesday, February 18, 2020 10:22 To: Vadim Ivanov ; solr-user@lucene.apache.org Subject: Re: A qu

Re: A question about solr filter cache

2020-02-17 Thread Hongxu Ma
51 To: solr-user@lucene.apache.org Subject: RE: A question about solr filter cache You can easily check amount of RAM used by core filterCache in Admin UI: Choose core - Plugins/Stats - Cache - filterCache It shows useful information on configuration, statistics and current RAM usage by filte

Re: A question about solr filter cache

2020-02-17 Thread Erick Erickson
of current filtercaches in RAM > Core, for ex, with 10 mln docs uses 1.3 MB of Ram for every filterCache > > >> -Original Message- >> From: Hongxu Ma [mailto:inte...@outlook.com] >> Sent: Monday, February 17, 2020 12:13 PM >> To: solr-user@lucene.apache.org

RE: A question about solr filter cache

2020-02-17 Thread Vadim Ivanov
mln docs uses 1.3 MB of Ram for every filterCache > -Original Message- > From: Hongxu Ma [mailto:inte...@outlook.com] > Sent: Monday, February 17, 2020 12:13 PM > To: solr-user@lucene.apache.org > Subject: A question about solr filter cache > > Hi > I want to k

Re: A question about solr filter cache

2020-02-17 Thread Mikhail Khludnev
Hello, The former https://github.com/apache/lucene-solr/blob/188f620208012ba1d726b743c5934abf01988d57/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L84 More efficient sets (roaring and/or elias-fano, iirc) present in Lucene, but not yet being used in Solr. On Mon, Feb 17, 2020 at

Re: A question about solr filter cache

2020-02-17 Thread Nicolas Franck
If 1GB would make solr go out of memory by using a filter query cache, then it would have already happened during the initial upload of the solr documents. Imagine the amount of memory you need for one billion documents.. A filter cache would be the least of your problems. 1GB is small in

A question about solr filter cache

2020-02-17 Thread Hongxu Ma
Hi I want to know the internals of the Solr filter cache, especially its memory usage. I googled some pages: https://teaspoon-consulting.com/articles/solr-cache-tuning.html https://lucene.472066.n3.nabble.com/Solr-Filter-Cache-Size-td4120912.html (Erick Erickson's answer) All of them said its

Re: Question about the max num of solr node

2020-01-03 Thread Jörn Franke
Why do you want to set up so many? What are your designs in terms of volumes / no of documents etc? > On 03.01.2020 at 10:32, Hongxu Ma wrote: > > Hi community > I plan to set up a 128 host cluster: 2 solr nodes on each host. > But I have a little concern about whether solr can support so

Question about the max num of solr node

2020-01-03 Thread Hongxu Ma
Hi community I plan to set up a 128 host cluster: 2 solr nodes on each host. But I have a little concern about whether solr can support so many nodes. I searched on wiki and found: https://cwiki.apache.org/confluence/display/SOLR/2019-11+Meeting+on+SolrCloud+and+project+health "If you create

Re: A question of solr recovery

2019-12-12 Thread Hongxu Ma
replica. Thanks. From: Erick Erickson Sent: Thursday, December 12, 2019 22:49 To: Hongxu Ma Subject: Re: A question of solr recovery If you’re using TLOG/PULL replica types, then only changed segments are downloaded. That replication pattern has a very different

Re: A question of solr recovery

2019-12-12 Thread Shawn Heisey
On 12/12/2019 8:53 AM, Shawn Heisey wrote: I do not think the replication handler deals with tlog files at all. The transaction log capability did not exist when the replication handler was built. I may have mixed up your message with a different one. Looking back over this, I don't see any

Re: A question of solr recovery

2019-12-12 Thread Shawn Heisey
On 12/12/2019 3:37 AM, Hongxu Ma wrote: And I found my "full sync" log: "IndexFetcher Total time taken for download (fullCopy=true,bytesDownloaded=178161685180) : 4377 secs (40704063 bytes/sec) to NIOFSDirectory@..." One more question: From the log, it looks like it downloaded all
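The figures in the quoted IndexFetcher log line are internally consistent, which a quick calculation confirms. The sketch below only re-derives the rate from the two numbers in the log; the use of integer division to match the logged value is an assumption about how the figure was rounded.

```python
bytes_downloaded = 178161685180  # bytesDownloaded from the log line
seconds = 4377                   # total time from the log line

bytes_per_sec = bytes_downloaded // seconds  # integer division
minutes, rest = divmod(seconds, 60)

print(bytes_per_sec)             # 40704063, as reported in the log
print(minutes, rest)             # 72 57
print(round(bytes_per_sec / 1e6, 1))  # 40.7 (MB/s, decimal megabytes)
```

So the full copy moved roughly 178 GB in about 73 minutes.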

Re: A question of solr recovery

2019-12-12 Thread Hongxu Ma
Thank you very much @Erick Erickson<mailto:erickerick...@gmail.com> It's very clear. And I found my "full sync" log: "IndexFetcher Total time taken for download (fullCopy=true,bytesDownloaded=178161685180) : 4377 secs (40704063 bytes/sec) to NIOFSDirectory@..." A m

Re: A question of solr recovery

2019-12-11 Thread Erick Erickson
g-transaction-logs-softcommit-and-commit-in-sorlcloud/ > > It mentioned in the recovery section: > "Replays the documents from its own tlog if < 100 new updates have been > received by the leader. " > > My question: what's the meaning of "updates"? comm

A question of solr recovery

2019-12-10 Thread Hongxu Ma
ments from its own tlog if < 100 new updates have been received by the leader. " My question: what's the meaning of "updates"? commits? or documents? I referred to the Solr code but am still not sure about it. Hope you can help, thanks.

Re: hi question about solr

2019-12-03 Thread Paras Lehana
That's not my question. It's a suggestion. I was asking if Highlighting could fulfill your requirement? On Tue, 3 Dec 2019 at 17:31, Bernd Fehling wrote: > No, I don't use any highlighting. > > On 03.12.19 at 12:28, Paras Lehana wrote: > > Hi Bernd, > > > > Have yo

Re: hi question about solr

2019-12-03 Thread Bernd Fehling
No, I don't use any highlighting. On 03.12.19 at 12:28, Paras Lehana wrote: > Hi Bernd, > > Have you gone through Highlighting > ? > > On Mon, 2 Dec 2019 at 17:00, eli chen wrote: > >> yes >> >> On Mon, 2 Dec 2019 at 13:29, Bernd

Re: hi question about solr

2019-12-03 Thread Paras Lehana
Hi Bernd, Have you gone through Highlighting ? On Mon, 2 Dec 2019 at 17:00, eli chen wrote: > yes > > On Mon, 2 Dec 2019 at 13:29, Bernd Fehling > > wrote: > > > In short, > > > > you are trying to use an indexer as a full-text

Re: hi question about solr

2019-12-02 Thread eli chen
first of all thank you very much. i was looking for a good resource to read on solr. i actually already tried the term vector. but for it to work i had to set fl=content, which responds with the value of the content field (which is really really big)

Re: hi question about solr

2019-12-02 Thread Charlie Hull
Hi, https://livebook.manning.com/book/solr-in-action/chapter-3 may help (I'd suggest reading the whole book as well). Basically what you're looking for is the 'term position'. The TermVectorComponent in Solr will allow you to return this for each result. Cheers Charlie On 02/12/2019

Re: hi question about solr

2019-12-02 Thread eli chen
yes On Mon, 2 Dec 2019 at 13:29, Bernd Fehling wrote: > In short, > > you are trying to use an indexer as a full-text search engine, right? > > Regards > Bernd > > On 02.12.19 at 12:24, eli chen wrote: > > hi im kind of new to solr so please be patient > > > > i'll try to explain what do i

Re: hi question about solr

2019-12-02 Thread Bernd Fehling
In short, you are trying to use an indexer as a full-text search engine, right? Regards Bernd On 02.12.19 at 12:24, eli chen wrote: > hi im kind of new to solr so please be patient > > i'll try to explain what do i need and what im trying to do. > > we a have a lot of books content and we

hi question about solr

2019-12-02 Thread eli chen
hi im kind of new to solr so please be patient i'll try to explain what i need and what im trying to do. we have a lot of book content and we want to index it and allow search in the books. when someone searches for a term i need to get back the position of the matched word in the book for

Re: Question about Luke

2019-11-20 Thread Tomoko Uchida
Hello, > Is it different from checkIndex -exorcise option? > (As far as I recently learned, checkIndex -exorcise will delete unreadable > indices. ) If you mean desktop app Luke, "Repair" is just a wrapper of CheckIndex.exorciseIndex(). There is no difference between doing "Repair" from Luke GUI

Re: Question about startup memory usage

2019-11-14 Thread Shawn Heisey
On 11/14/2019 1:46 AM, Hongxu Ma wrote: Thank you @Shawn Heisey , you help me many times. My -xms=1G When restart solr, I can see the progress of memory increasing (from 1G to 9G, took near 10s). I have a guess: maybe solr is loading some needed files into heap

Re: Question about startup memory usage

2019-11-14 Thread Hongxu Ma
What's your thoughts? thanks. From: Shawn Heisey Sent: Thursday, November 14, 2019 1:15 To: solr-user@lucene.apache.org Subject: Re: Question about startup memory usage On 11/13/2019 2:03 AM, Hongxu Ma wrote: > I have a solr-cloud cluster with a big collecti

Re: Question about startup memory usage

2019-11-13 Thread Shawn Heisey
billion docs (but each doc is very small: only some bytes), total size 3TB. My question is: Is the 9G mem usage after startup normal? If so, I am worried that the follow up index/search operations will cause an OOM error. And how can I reduce the memory usage? Maybe I should introduce more host

Question about startup memory usage

2019-11-13 Thread Hongxu Ma
: only some bytes), total size 3TB. My question is: Is the 9G mem usage after startup normal? If so, I am worried that the follow up index/search operations will cause an OOM error. And how can I reduce the memory usage? Maybe I should introduce more host with nodes, but besides this, is there any

Re: Question about memory usage and file handling

2019-11-11 Thread Erick Erickson
(1) no. The internal Ram buffer will pretty much limit the amount of heap used however. (2) You actually have several segments. “.cfs” stands for “Compound File”, see: https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/codecs/lucene70/package-summary.html "An optional "virtual" file

Re: Question about memory usage and file handling

2019-11-11 Thread Shawn Heisey
On 11/11/2019 1:40 PM, siddharth teotia wrote: I have a few questions about Lucene indexing and file handling. It would be great if someone can help with these. I had earlier asked these questions on gene...@lucene.apache.org but was asked to seek help here. This mailing list (solr-user) is

Question about memory usage and file handling

2019-11-11 Thread siddharth teotia
Hi All, I have a few questions about Lucene indexing and file handling. It would be great if someone can help with these. I had earlier asked these questions on gene...@lucene.apache.org but was asked to seek help here. (1) During indexing, is there any knob to tell the writer to use off-heap

Question about Luke

2019-11-11 Thread Kayak28
Hello, Community: I am using Solr7.4.0 currently, and I was testing how Solr actually behaves when it has a corrupted index. And I used Luke to fix the broken index from GUI. I just came up with the following questions. Is it possible to use the repair index tool from CLI? (in the case, Solr was

Re: Solr 7.6 query performace question

2019-10-13 Thread Erick Erickson
Well, It Depends (tm). Certainly 2 and 3 are _not_ memory intensive 4 depends on the number of terms in the fields. But I suspect your real problem has nothing to do with memory and is <1>. Try q=*:* rather than q=*. In case your e-mail tries to make things bold, that’s

Solr 7.6 query performace question

2019-10-13 Thread harjags
We are upgrading to solr 7.6 from 6.1 Our query has below pattern predominantly 1.q is * as we filter based on a department of products always 2. 100+ bq's to boost certain document 3. Collapsing using a non DocValue field 4.Many Facet Fields and Many Facet queries Which of the above is the most

Re: Question regarding subqueries

2019-10-03 Thread Bram Biesbrouck
Hi Mikhail, You're right, I'm probably over-complicating things. I was stuck trying to combine a function in a regular query using a local variable, but Solr doesn't seem to bend the way my mind did ;-) Anyway, I worked around it using your suggestion and/or a slightly modified prefix parser

Re: Question regarding subqueries

2019-10-02 Thread Mikhail Khludnev
Hello, Bram. Something like that is possible in principle, but it will take enormous efforts to tackle exact syntax. Why not something like children.fq=-parent:true ? On Wed, Oct 2, 2019 at 8:52 PM Bram Biesbrouck < bram.biesbro...@reinvention.be> wrote: > Hi all, > > I'm struggling with a

Question regarding subqueries

2019-10-02 Thread Bram Biesbrouck
Hi all, I'm struggling with a little period-sign difficulty and instead of pulling out my hair, I wonder if any of you could help me out... Here's the query: q=uri:"/en/blah"=id,uri,children:[subquery]={!prefix f=id v=$ row.id}=* It just searches for a document with the field "uri" set to

Re: auto scaling question - solr 8.2.0

2019-09-26 Thread Joe Obernberger
Just as another data point.  I just tried again, and this time, I got an error from one of the remaining 3 nodes: Error while trying to recover. core=UNCLASS_2019_6_8_36_shard2_replica_n21:java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.SolrServerException: IOException

auto scaling question - solr 8.2.0

2019-09-26 Thread Joe Obernberger
Hi all - I have a 4 node cluster for test, and created several solr collections with 2 shards and 2 replicas each. I'd like the global policy to be to not place more than one replica of the same shard on the same node.  I did this with this curl command: curl -X POST -H

ASCIIFoldingFilter question

2019-09-25 Thread Jarett Lear
Hope this is the right list to ask this, not sure if this is a bug or if I'm doing something wrong. We're running some text with some emojis through this filter and if I'm reading the code right when it finds a U+203C (:bangbang: | double exclamation) it replaces that with an appropriate !! ASCII

Re: Question about "No registered leader" error

2019-09-19 Thread Hongxu Ma
hen this error happens. Thanks again. From: Shawn Heisey Sent: Wednesday, September 18, 2019 20:21 To: solr-user@lucene.apache.org Subject: Re: Question about "No registered leader" error On 9/18/2019 6:11 AM, Shawn Heisey wrote: > On 9/17/2019 9:35 PM,

Re: Question about "No registered leader" error

2019-09-18 Thread Erick Erickson
> > pretty good. My G1 settings might do slightly better, but the > > improvement won't be dramatic unless your existing commandline has > > absolutely no gc tuning at all. > > That question will be important. If you already have our CMS GC tuning, > switching to G1 probably

Re: Question about "No registered leader" error

2019-09-18 Thread Shawn Heisey
olr do you have, and what is your max heap?  The CMS garbage collection that Solr 5.0 and later incorporate by default is pretty good.  My G1 settings might do slightly better, but the improvement won't be dramatic unless your existing commandline has absolutely no gc tuning at all. Tha

Re: Question about "No registered leader" error

2019-09-18 Thread Shawn Heisey
On 9/17/2019 9:35 PM, Hongxu Ma wrote: My questions: * Is this error possibly caused by "long gc pause"? my solr zkClientTimeout=6 It's possible. I can't say for sure that this is the issue, but it might be. * If so, how can I prevent this error from happening? My thoughts: using

Question about "No registered leader" error

2019-09-17 Thread Hongxu Ma
Hi all I got an error when I was doing an index operation: "2019-09-18 02:35:44.427244 ... No registered leader was found after waiting for 4000ms , collection: foo slice: shard2" Besides it, there is no other error in the solr log. Collection foo has 2 shards, so I checked their jvm gc logs: *

Re: Question: Solr perform well with thousands of replicas?

2019-09-04 Thread Hongxu Ma
ser. From: Erick Erickson Sent: Monday, September 2, 2019 21:20 To: solr-user@lucene.apache.org Subject: Re: Question: Solr perform well with thousands of replicas? > why so many collection/replica: it's our customer needs, for example: each > database table

Re: Question: Solr perform well with thousands of replicas?

2019-09-02 Thread Erick Erickson
___ > From: Erick Erickson > Sent: Friday, August 30, 2019 20:05 > To: solr-user@lucene.apache.org > Subject: Re: Question: Solr perform well with thousands of replicas? > > “no registered leader” is the effect of some problem usually, not the root >
