Re: Query number of Lucene documents using Solr?

2019-08-27 Thread Bernd Fehling
You might use Lucene's internal CheckIndex tool, which is included in lucene-core. It should tell you everything you need, or at least give you a good starting point for writing your own tool. Copy lucene-core-x.y.z-SNAPSHOT.jar and lucene-misc-x.y.z-SNAPSHOT.jar to a local directory. java -cp

Re: Query number of Lucene documents using Solr?

2019-08-27 Thread Bram Van Dam
On 26/08/2019 23:12, Shawn Heisey wrote: > The numbers shown in Solr's LukeRequestHandler come directly from > Lucene.  This is the URL endpoint it will normally be at, for core XXX: > > http://host:port/solr/XXX/admin/luke Thanks Shawn, that's a great entry point! > The specific error you
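
A minimal sketch of the endpoint pattern quoted above. The host, port, and core name are placeholders, and the class and method names are hypothetical. The numTerms=0 and wt=json parameters are additions not mentioned in the thread; they ask Luke to skip per-field term statistics and to answer in JSON. The subtraction assumes the usual Lucene meaning of the two counters: maxDoc still includes deleted documents that have not been merged away, while numDocs counts only live documents.

```java
import java.net.URI;

// Sketch only: names and values here are illustrative, not taken from the thread.
final class LukeStats {

    /** Builds the LukeRequestHandler URL for one core: http://host:port/solr/CORE/admin/luke */
    static URI lukeUri(String host, int port, String core) {
        // numTerms=0 and wt=json are optional extras (assumed here, see lead-in).
        return URI.create("http://" + host + ":" + port + "/solr/" + core
                + "/admin/luke?numTerms=0&wt=json");
    }

    /** Deleted documents still held in the index: maxDoc includes them, numDocs does not. */
    static long deletedDocs(long maxDoc, long numDocs) {
        return maxDoc - numDocs;
    }

    public static void main(String[] args) {
        // Placeholder host, port, and core name.
        System.out.println(lukeUri("localhost", 8983, "XXX"));
        // Made-up counters, only to show the arithmetic.
        System.out.println(deletedDocs(1_200_000L, 1_000_000L));
    }
}
```

The numDocs and maxDoc values would be read from the "index" section of the Luke response; fetching and parsing that response is left out to keep the sketch free of dependencies.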

RE: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-27 Thread Wittenberg, Lucas
Thanks for the suggestion. But the "customid" field is already set as docValues="true" actually. Well, I guess so as it is a type="string" which by default has docValues="true". -Original Message- From: Wittenberg, Lucas Sent: Monday, 26 August 2019 18:01 To:

Re: Query number of Lucene documents using Solr?

2019-08-27 Thread Erick Erickson
Bram: If you optimize (Solr 7.4 and earlier), that may be part of the “stuff” as an index with a single segment can accumulate far more deleted documents. Shot in the dark. See: https://lucidworks.com/post/segment-merging-deleted-documents-optimize-may-bad/ Plus the linked article to how

Re: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-27 Thread Toke Eskildsen
On Mon, 2019-08-26 at 16:01 +, Wittenberg, Lucas wrote: > @Override > public void collect(int docNumber) throws IOException { > if (null != this.reader && > isValid(this.reader.document(docNumber).get("customid"))) > { >

Distributed graph traversal

2019-08-27 Thread Komal Motwani
Hi All, I am looking at ways of doing distributed graph queries in Solr. The data I am looking at cannot fit into a single core because of some design constraints, nor can we use collections on SolrCloud. I could find a tech talk from Kevin Watters on the same and see he has raised issues with

Re: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-27 Thread Erick Erickson
Well, the question is then whether you’re getting the value from the docValues structure or the stored structure. My bet is the latter. A simple test would be to comment out the line and return some random value just to see how long it takes. > On Aug 27, 2019, at 5:05 AM, Wittenberg, Lucas >

Re: Distributed graph traversal

2019-08-27 Thread Erick Erickson
Have you looked at streaming? https://lucene.apache.org/solr/guide/6_6/graph-traversal.html Best, Erick > On Aug 27, 2019, at 6:47 AM, Komal Motwani wrote: > > Hi All, > > I am looking at ways for doing distributed graph query in solr. The data i > am looking at can not fit into single core

Error in Dataimport without reason or log

2019-08-27 Thread Daniel Carrasco
Hello, I am writing because I'm having problems importing some data from a MariaDB database into my Solr Cloud cluster, and I'm not able to see the data or where the import problem is. My Solr has a dataimport that queries a MariaDB database and indexes the data, but it seems not to be working. When the

Re: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-27 Thread Toke Eskildsen
On Tue, 2019-08-27 at 09:05 +, Wittenberg, Lucas wrote: > But the "customid" field is already set as docValues="true" actually. > Well, I guess so as it is a type="string" which by default has > docValues="true". > > required="true" multiValued="false" /> > docValues="true" /> Yeah, it's a

Query-time synonyms without indexing

2019-08-27 Thread Bjarke Buur Mortensen
We have a Solr field of type "string". It turns out that we need to do synonym expansion at query time in order to account for some changes over time in the values stored in that field. So we have tried introducing a custom fieldType that applies the synonym filter at query time only (see bottom

RE: Require searching only for file content and not metadata

2019-08-27 Thread Khare, Kushal (MIND)
Basically, the problem I am facing is that I am getting the textual content plus other metadata in my _text_ field. But I want only the textual content written inside the document. I tried various Request Handler Update Extract configurations, but none of them worked for me. Please help me resolve

Re: Require searching only for file content and not metadata

2019-08-27 Thread Yogendra Kumar Soni
It will be easier to parse the documents and create the content, metadata, and other required fields yourself instead of using the default post tool. You will have better control over what goes into which field. On Tue 27 Aug, 2019, 6:48 PM Khare, Kushal (MIND), < kushal.kh...@mind-infotech.com> wrote: >

Re: Query-time synonyms without indexing

2019-08-27 Thread Bjarke Buur Mortensen
Yes, but isn't that what I am already doing in this case (look at the fieldType in the original mail)? Is there some other way to specify that field type and achieve what I want? Thanks, Bjarke On Tue, Aug 27, 2019, 17:32 Erick Erickson wrote: > You can have separate index and query time

Index fetch failed

2019-08-27 Thread Akreeti Agarwal
Hello Everyone, I am getting this error continuously on Solr slave, can anyone tell me the solution for this: 642141666 ERROR (indexFetcher-72-thread-1) [ x:sitecore_web_index] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: Unable to download

Re: Index fetch failed

2019-08-27 Thread Atita Arora
Hi, Do you have enough free memory for the index chunk to be fetched/downloaded on the slave node? On Wed, Aug 28, 2019 at 6:57 AM Akreeti Agarwal wrote: > Hello Everyone, > > I am getting this error continuously on Solr slave, can anyone tell me the > solution for this: > > 642141666 ERROR

Re: Query-time synonyms without indexing

2019-08-27 Thread Erick Erickson
You can have separate index and query time analysis chains; there are many examples in the stock Solr schemas. Best, Erick > On Aug 27, 2019, at 8:48 AM, Bjarke Buur Mortensen > wrote: > > We have a Solr field of type "string". > It turns out that we need to do synonym expansion at query time

Re: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-27 Thread Erick Erickson
> I don't know the precedence rules for stored vs. docValues in Solr DocValues are used if (and only if) all the fields being returned have docValues=“true” _and_ are single-valued, or if you’ve explicitly set useDocValuesAsStored. Single-valued docValues are the only situation where the

RE: Require searching only for file content and not metadata

2019-08-27 Thread Khare, Kushal (MIND)
Chris, What I have done is this: I just created a core, used the POST tool to index the documents from my file system, and then moved to Solr Admin for querying. By 'Metadata' vs 'Content', I mean that I just want the field '_text_' to be searched, instead of all the fields that Solr creates by

What are the risks of running into "Unmap hack not supported on this platform"

2019-08-27 Thread Pushkar Raste
Hi, I am trying to run Solr 4 on JDK11. Although this version is not supported on JDK11, it seems to be working fine except for the error/exception "Unmap hack not supported on this platform". What are the risks/downsides of running into this?