Re: Need help on handling large size of index.

2020-05-21 Thread Modassar Ather
Thanks Shawn for your response. We have seen optimisation get faster with a higher number of IOPS. Without the extra IOPS, optimisation took around 15-20 hours, whereas the same index took 5-6 hours to optimise with higher IOPS. Yes, the entire extra IOPS were never used to full

Re: Need help on handling large size of index.

2020-05-21 Thread Modassar Ather
Thanks Phill for your response. Optimal Index size: Depends on what you are optimizing for. Query Speed? Hardware utilization? We are optimising it for query speed. As I understand it, even if we set the merge policy to any number, the amount of hard disk will still be required for the bigger

Re: Query takes more time in Solr 8.5.1 compared to 6.1.0 version

2020-05-21 Thread Jason Gerlowski
Hi Jay, I can't speak to why you're seeing a performance change between 6.x and 8.x. What I can suggest though is an alternative way of formulating the query: you might get different performance if you run your query using Solr's "terms" query parser:
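For illustration only, a query for Solr's "terms" query parser can be assembled as below. This is a minimal sketch: the field name `id` and the values are hypothetical and are not taken from Jay's original query, and the helper `terms_query` is an invented name.

```python
def terms_query(field, values, separator=","):
    """Build a query string for Solr's {!terms} query parser.

    The field and values passed in are illustrative assumptions.
    """
    parts = [str(v) for v in values]
    for part in parts:
        # A value containing the separator would be split into two terms.
        if separator in part:
            raise ValueError("value contains the separator: %r" % part)
    return "{!terms f=%s}%s" % (field, separator.join(parts))


# Hypothetical usage: match documents whose id is any of these values.
example_query = terms_query("id", ["1", "2", "3"])
```

The resulting string would be sent as the `q` or `fq` parameter in place of a long chain of OR clauses.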

Is it possible to direct queries to replicas in SolrCloud

2020-05-21 Thread Pushkar Raste
Hi, In master/slave we can send queries to slaves only, now that we have tlog and pull replicas can we send queries to those replicas to achieve similar scaling like master/slave for large search volumes? -- — Pushkar Raste

Re: Why Did It Match?

2020-05-21 Thread Doug Turnbull
Is your concern that the Solr explain functionality is slower than Endeca's? Or harder to understand/interpret? If the latter, I might recommend http://splainer.io as one solution. On Thu, May 21, 2020 at 4:52 PM Webster Homer < webster.ho...@milliporesigma.com> wrote: > My company is working on

Re: Is it possible to direct queries to replicas in SolrCloud

2020-05-21 Thread Erick Erickson
https://lucene.apache.org/solr/guide/7_7/distributed-requests.html > On May 21, 2020, at 5:40 PM, Pushkar Raste wrote: > > Hi, > In master/slave we can send queries to slaves only, now that we have tlog > and pull replicas can we send queries to those replicas to achieve similar > scaling like
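As a rough sketch of what that page describes, the `shards.preference` request parameter can express a preference for certain replica types. The helper name, the query, and the choice of PULL before TLOG below are assumptions for illustration; as I understand it this is a preference, not a hard guarantee, so check the Ref Guide for your Solr version.

```python
def replica_preference_params(query, replica_types=("PULL", "TLOG")):
    """Return request parameters that prefer the given replica types.

    The order of replica_types is the order of preference (an assumption
    chosen for this example).
    """
    preference = ",".join("replica.type:" + t for t in replica_types)
    return {"q": query, "shards.preference": preference}


# Hypothetical usage: prefer PULL replicas, then TLOG replicas.
params = replica_preference_params("*:*")
```

These parameters would be added to an ordinary `/select` request against the collection.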

Re: TimestampUpdateProcessorFactory updates the field even if the value is present

2020-05-21 Thread Furkan KAMACI
Hi, How do you index that document? Do you index it with an empty *index_time_stamp_create* field the second time too? Kind Regards, Furkan KAMACI On Fri, May 22, 2020 at 12:05 AM gnandre wrote: > Hi, > > Following is the update request processor chain. > > > < > processor

Require java 8 upgrade

2020-05-21 Thread Akhila John
Hi Team, We use Solr 5.3.1 for Sitecore 8.2. We need to upgrade the Java version to 'Java 8 Update 251' and remove / upgrade Wireshark to 3.2.3 on our application servers. Could you please advise whether this would have any impact on Solr? Does Solr 5.3.1 support Java 8? Thanks and regards,

Why Did It Match?

2020-05-21 Thread Webster Homer
My company is working on a new website. The old/current site is powered by Endeca. The site under development is powered by Solr (currently 7.7.2) Out of the box, Endeca provides the capability to show how a query was matched in the search. The business users like this functionality, in solr

Re: TimestampUpdateProcessorFactory updates the field even if the value is present

2020-05-21 Thread gnandre
Hi, I do not pass that field at all. Here is the document that I index again and again to test through Solr Admin UI. { asset_id:"x:1", title:"x" } On Thu, May 21, 2020 at 5:25 PM Furkan KAMACI wrote: > Hi, > > How do you index that document? Do you index it with an empty >

Re: TimestampUpdateProcessorFactory updates the field even if the value is present

2020-05-21 Thread Furkan KAMACI
Hi, Do you have an id field for your documents? On the other hand, does your document count increase when you index it again? Kind Regards, Furkan KAMACI On Fri, May 22, 2020 at 1:03 AM gnandre wrote: > Hi, > > I do not pass that field at all. > > Here is the document that I index again and

Re: Require java 8 upgrade

2020-05-21 Thread Furkan KAMACI
Hi Akhila, Here is the related documentation: https://lucene.apache.org/solr/5_3_1/SYSTEM_REQUIREMENTS.html which says: "Apache Solr runs of Java 7 or greater, Java 8 is verified to be compatible and may bring some performance improvements. When using Oracle Java 7 or OpenJDK 7, be sure to not

Re: json faceting - Terms faceting and EnumField

2020-05-21 Thread Ponnuswamy, Poornima (GE Healthcare)
Can anyone shed some light on the issue I am having? Thanks! On 5/20/20, 4:55 PM, "Ponnuswamy, Poornima (GE Healthcare)" wrote: Hello, We have solr 6.6 version. Below is the field and field type that is defined in solr schema. Below is the configuration

TimestampUpdateProcessorFactory updates the field even if the value is present

2020-05-21 Thread gnandre
Hi, Following is the update request processor chain. < processor class="solr.TimestampUpdateProcessorFactory"> index_time_stamp_create And, here is how the field is defined in schema.xml Every time I index the same document, the above field changes its value to the latest timestamp. According
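One possible reading of this behaviour can be sketched with a toy model. This is not Solr code: it assumes the processor only fills the field when the incoming document lacks it, that re-indexing a full document replaces the stored one, and that `asset_id` is the unique key (the thread does not confirm this). The function names and timestamps are hypothetical.

```python
def apply_timestamp_processor(doc, field, now):
    """Toy model: add `field` only when the incoming document lacks it."""
    out = dict(doc)
    if field not in out:
        out[field] = now
    return out


def index(store, doc, field, now, key="asset_id"):
    """Toy model of a full re-index: the stored document is overwritten."""
    processed = apply_timestamp_processor(doc, field, now)
    store[processed[key]] = processed
    return processed


FIELD = "index_time_stamp_create"
store = {}
doc = {"asset_id": "x:1", "title": "x"}  # the field is not passed at all

first = index(store, doc, FIELD, "2020-05-21T00:00:00Z")
second = index(store, doc, FIELD, "2020-05-22T00:00:00Z")

# Passing the previously stored value along keeps it unchanged.
kept = index(store, dict(doc, **{FIELD: first[FIELD]}), FIELD, "2020-05-23T00:00:00Z")
```

Under these assumptions the incoming document never carries the old value, so each re-index is stamped afresh; the value is only preserved when the request itself includes it.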

Use Subquery Parameters to filter main query

2020-05-21 Thread rantonana
Hello, I need to do the following: I have a main query that defines a subquery called group with "fields": "*,group:[subquery]". The group document has a lot of fields, but I want to filter the main query based on one of them. ex: { PID:1, type:doc, "group":{"numFound":1,"start":0,"docs":[

Re: Solr Atomic update change value and field name

2020-05-21 Thread Jan Høydahl
Try adding -format solr to your bin/post command. By default the post command will treat input as arbitrary json, not solr-format json. Jan Høydahl > 21. mai 2020 kl. 02:50 skrev Hup Chen : > > I am new to Solr. I tried to do Atomic update by using .json file update. > $SOLR/bin/post not
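For reference, a Solr-format atomic update body can be generated as in the sketch below. The document id `book1`, the field `price`, and the helper name are hypothetical; only the `{"set": value}` modifier shape is the standard atomic update syntax.

```python
import json


def atomic_set(doc_id, field, value, id_field="id"):
    """Build one atomic-update document that sets `field` to `value`."""
    return {id_field: doc_id, field: {"set": value}}


# Hypothetical usage: write this to a .json file and post it with -format solr.
payload = json.dumps([atomic_set("book1", "price", 9.99)], indent=2)
```

Without `-format solr`, the post tool would treat the `{"set": ...}` object as ordinary nested JSON rather than as an update instruction, which matches the symptom described in the quoted message.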

Re: Shingles behavior

2020-05-21 Thread Radu Gheorghe
Turns out, it’s down to setting enableGraphQueries=false in the field definition. I completely missed that :( > On 21 May 2020, at 07:49, Radu Gheorghe wrote: > > Hi Alex, long time no see :) > > I tried with sow, and that basically invalidates query-time shingles (it only > matches mona OR

Re: How to restore deleted collection from filesystem

2020-05-21 Thread Erick Erickson
See inline. > On May 21, 2020, at 10:13 AM, Kommu, Vinodh K. wrote: > > Thanks Eric for quick response. > > Yes, our VMs are equipped with NetBackup which is like file based backup and > it can restore any files or directories that were deleted from latest > available full backup. > > Can

+(-...) vs +(*:* -...) vs -(+...)

2020-05-21 Thread Jochen Barth
Dear reader, why does +(-x_ss:y) find 0 docs, while -(+x_ss:y) finds many docs? Ok... +(*:* -x_ss:y) works, too, but I'm a bit surprised. Kind regards, J. Barth

Re: Need help on handling large size of index.

2020-05-21 Thread Phill Campbell
The optimal size for a shard of the index is by definition what works best on the hardware with the JVM heap that is in use. More shards mean smaller sizes of the index for the shard as you already know. I spent months changing the sharding, the JVM heap, the GC values before taking the system

Re: +(-...) vs +(*:* -...) vs -(+...)

2020-05-21 Thread Houston Putman
Jochen, For the standard query parser, pure negative queries (no positive query in front of it, such as "*:*") are only allowed as a top level clause, so not nested within parentheses. Check the second bullet point of this section of the Ref Guide page for the Standard Query Parser.

Re: +(-...) vs +(*:* -...) vs -(+...)

2020-05-21 Thread Shawn Heisey
On 5/21/2020 12:25 PM, Jochen Barth wrote: why does +(-x_ss:y) find 0 docs, while -(+x_ss:y) finds many docs? Ok... +(*:* -x_ss:y) works, too, but I'm a bit surprised. Purely negative queries, if that is what ultimately makes it to Lucene, do not work. The basic problem is that if you
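The point about purely negative queries can be modelled with plain sets. This is a toy sketch, not Lucene's implementation: a boolean query is treated as the intersection of its required clauses minus its prohibited clauses, and with no required clause there is nothing to subtract from. The document ids are made up.

```python
def evaluate(positive_sets, negative_sets):
    """Toy boolean query: intersect required sets, then remove prohibited ones."""
    if not positive_sets:
        # Nothing was selected, so there is nothing to subtract from.
        return set()
    result = set.intersection(*positive_sets)
    for excluded in negative_sets:
        result -= excluded
    return result


all_docs = {1, 2, 3, 4}   # what *:* matches
x_ss_y = {2, 3}           # docs matching x_ss:y (hypothetical)

nested_negative = evaluate([], [x_ss_y])          # models (-x_ss:y)
with_match_all = evaluate([all_docs], [x_ss_y])   # models (*:* -x_ss:y)
```

In this model the nested negative clause yields an empty set, while adding `*:*` gives the expected complement. As noted in the thread, the parser adds the match-all clause itself only for a top-level pure negative query.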

Re: Does Solr master/slave support shard split

2020-05-21 Thread Erick Erickson
In a word, “no”. It’s a whole ’nother architecture to deal with shards, and stand-alone (i.e. master/slave) has no concept of that. You could make a single-shard collection in SolrCloud, copy the index to the right place (I’d shut down Solr while I copied it), and then use SPLITSHARD on it, but
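Once the index is in a single-shard SolrCloud collection, the split itself is a Collections API call. The sketch below only builds the request URL; the base URL, the collection name `migrated`, and the helper name are assumptions for illustration.

```python
from urllib.parse import urlencode


def splitshard_url(base_url, collection, shard):
    """Build a Collections API URL that asks Solr to split one shard."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


# Hypothetical usage against a local SolrCloud node.
url = splitshard_url("http://localhost:8983/solr", "migrated", "shard1")
```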

Re: Unbalanced shard requests

2020-05-21 Thread Phill Campbell
Yes, JVM heap settings. > On May 19, 2020, at 10:59 AM, Wei wrote: > > Hi Phill, > > What is the RAM config you are referring to, JVM size? How is that related > to the load balancing, if each node has the same configuration? > > Thanks, > Wei > > On Mon, May 18, 2020 at 3:07 PM Phill

Re: Does Solr master/slave support shard split

2020-05-21 Thread Pushkar Raste
Thanks Eric. Moving to SolrCloud for splitting is what I too imagined  On Thu, May 21, 2020 at 1:28 PM Erick Erickson wrote: > In a word, “no”. It’s a whole ’nother architecture to deal > with shards, and stand-alone (i.e. master/slave) has no > concept of that. > > You could make a

Re: How to restore deleted collection from filesystem

2020-05-21 Thread Erick Erickson
So what I’m reading here is that you have the _data_ saved somewhere, right? By “data” I just mean the data directories under the replica. 1> Go ahead and recreate the collection. It _must_ have the same number of shards. Make it leader-only, i.e. replicationFactor == 1 2> The collection will
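The first step above, recreating the collection with the same shard count and a single replica per shard, could be expressed as Collections API parameters like this. It is a sketch only: the collection name and shard count are placeholders, and a real request may need further parameters (for example a configset) that the message does not mention.

```python
def recreate_collection_params(name, num_shards):
    """Parameters for a CREATE call: same shard count, leader-only."""
    return {
        "action": "CREATE",
        "name": name,
        "numShards": str(num_shards),
        "replicationFactor": "1",  # leader-only, as described in step 1
    }


# Hypothetical values; use the deleted collection's real name and shard count.
create_params = recreate_collection_params("my_collection", 8)
```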

Re: Query takes more time in Solr 8.5.1 compared to 6.1.0 version

2020-05-21 Thread Jörn Franke
Did you create Solrconfig.xml for the collection from scratch after upgrading and reindexing? Was it based on the latest template? If not then please try this. Maybe also you need to increase the corresponding caches in the config. What happens if you reexecute the query? Are there other

How to restore deleted collection from filesystem

2020-05-21 Thread Kommu, Vinodh K.
Hi, One of our largest collections, which holds 3.2 billion docs, was deleted accidentally in the QA environment. Unfortunately we don't have a recent Solr backup of this collection to restore from either. The only option left for us is to restore deleted replica directories under data directory using

Re: Need help on handling large size of index.

2020-05-21 Thread Erick Erickson
Please consider _not_ optimizing. It’s kind of a misleading name anyway, and with the version of Solr you’re using it may have unintended consequences, see: https://lucidworks.com/post/segment-merging-deleted-documents-optimize-may-bad/ and

Re: Query takes more time in Solr 8.5.1 compared to 6.1.0 version

2020-05-21 Thread vishal patel
Is anyone looking at this issue? I have the same issue. Regards, Vishal Patel From: jay harkhani Sent: Wednesday, May 20, 2020 7:39 PM To: solr-user@lucene.apache.org Subject: Query takes more time in Solr 8.5.1 compared to 6.1.0 version Hello, Currently I upgrade

Re: Use cases for the graph streams

2020-05-21 Thread Joel Bernstein
Good question. Let me first point to an interesting example in the Visual Guide to Streaming Expressions and Math Expressions: https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/search-sample.adoc#nodes This example gets to the heart of the core use case for the

RE: How to restore deleted collection from filesystem

2020-05-21 Thread Kommu, Vinodh K.
Thanks Eric for quick response. Yes, our VMs are equipped with NetBackup which is like file based backup and it can restore any files or directories that were deleted from latest available full backup. Can we create an empty collection with the same name which was deleted with same number of

Re: Query takes more time in Solr 8.5.1 compared to 6.1.0 version

2020-05-21 Thread jay harkhani
Hello, Please refer below details. >Did you create Solrconfig.xml for the collection from scratch after upgrading >and reindexing? Yes, We have created collection from scratch and also re-indexing. >Was it based on the latest template? Yes, It was as per latest template. >What happens if you

Does Solr master/slave support shard split

2020-05-21 Thread Pushkar Raste
Hi, Does Solr support shard split in the master/slave setup? I understand that there is no shard concept in master/slave and we just have cores, but can we split a core into two? If yes, is there a way to specify a new mapping based on the unique key? -- — Pushkar Raste