Help with Stream Graph

2019-10-16 Thread Rajeswari Natarajan
Hi, since the stream graph query for my use case didn't work, I took the data from the Solr source code test and also copied the schema and solrconfig.xml from the Solr 7.6 source code. I had to substitute a few variables. The data is posted below: curl -X POST http://localhost:8983/solr/knr/update -H

Solr JVM Tuning - 7.2.1

2019-10-16 Thread Sethuraman, Ganesh
Hi, We are using Solr 7.2.1 with 2 nodes (245 GB RAM each) and a 3-node ZK cluster in production. We are using Java 8 with default GC settings (with NewRatio=3) and a 15 GB max heap, changed to 16 GB after the performance issue mentioned below. We have about 90 collections in this (~8 shards

Re: Solr JVM performance challenge with Updates

2019-10-16 Thread GaneshSe
Any help on this is much appreciated. -- Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

RE: Highlighting Solr 8

2019-10-16 Thread Eric Allen
Thanks for the reply. Currently we are migrating from Solr 4 to Solr 8. Under Solr 4 we wrote our own highlighter because the provided one was too slow for our documents. We deal with many large documents, but we have full term vectors already. So as I understand it from my reading of the code

Re: Solr-Cloud, join and collection collocation

2019-10-16 Thread Nicolas Paris
> Note: adding score=none as a local param. Turns another algorithm > dragging by from side join. Indeed, the behavior with the score=none local param is a query time correlated with the joined collection subset size. For a subset of 100k documents, the query time is 1 second, 4 sec for 1M I get

Re: solr 8.1.1 many time slower returning query results than solr 4.10.4 or solr 6.5.1

2019-10-16 Thread Russell Bahr
Hi Shawn, Just checking to see if you saw my reply and had any feedback. Thank you again for your help. It is much appreciated. Thank you, Russ From: Russell Bahr Date: Tuesday, October 15, 2019 at 11:50 AM To: "solr-user@lucene.apache.org" Subject: Re: solr 8.1.1 many time slower returning

Re: Position search

2019-10-16 Thread Tim Casey
Adi, If you are looking for something specific you might want to try something different. Before you search 'the end of a document', you might think about segmenting the document and searching specific segments. At the end of a lot of things, like email, there will be signatures. Those are

Re: Position search

2019-10-16 Thread Alexandre Rafalovitch
Well, after some digging and trying to recall things: 1) XMLParser allows you to specify a query in a different way from normal query parameters: https://lucene.apache.org/solr/guide/8_1/other-parsers.html#xml-query-parser 2) SpanFirst allows you to anchor the search to the start of the text and provide

Re: Query on autoGeneratePhraseQueries

2019-10-16 Thread Michael Gibney
Going back to the initial question, the wording is a little ambiguous and it occurs to me that it's possible there's a misunderstanding of what autoGeneratePhraseQueries does. It really only auto-generates phrase *subqueries*. To use the example from the initial request, a query like (black

Re: The Visual Guide to Streaming Expressions and Math Expressions

2019-10-16 Thread Joel Bernstein
Hi Pratik, The visualizations are all done using Apache Zeppelin and the Zeppelin-Solr interpreter. The getting started part of the user guide provides links for Zeppelin-Solr. The install process is pretty quick. This is all open source, freely available software. It's possible that Zeppelin-Solr

Do backups of collections need to be taken on the Leader?

2019-10-16 Thread Koen De Groote
I'm trying to restore a couple of collections, and one keeps failing. This happens to be the only one whose leader isn't on the host that the backup was taken from. The backup was done on server1, for all collections. For this collection that is failing, the Leader was on server2. All other

Re: The Visual Guide to Streaming Expressions and Math Expressions

2019-10-16 Thread Pratik Patel
Hi Joel, Looks like this is going to be very helpful, thank you! I am wondering whether the visualizations are generated through a third-party library, or is it something which would be part of the Solr distribution?

The Visual Guide to Streaming Expressions and Math Expressions

2019-10-16 Thread Joel Bernstein
Hi, The Visual Guide to Streaming Expressions and Math Expressions is now complete. It's been published to Github at the following location: https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/math-expressions.adoc#streaming-expressions-and-math-expressions The guide

Re: Query on autoGeneratePhraseQueries

2019-10-16 Thread Shawn Heisey
On 10/16/2019 7:14 AM, Shubham Goswami wrote: I have implemented the sow=false property with the eDismax query parser, but it still does not have any effect on the query, as it is still parsed as separate terms instead of as a phrase. We have seen reports that when sow=false, which is the default

Re: Re: Query on autoGeneratePhraseQueries

2019-10-16 Thread Shubham Goswami
Hi Rohan/Audrey, I have implemented the sow=false property with the eDismax query parser, but it still does not have any effect on the query, as it is still parsed as separate terms instead of as a phrase. On Tue, Oct 15, 2019 at 8:25 PM Rohan Kasat wrote: > Also check , > pf , pf2 , pf3 > ps , ps2,

Re: Solr-Cloud, join and collection collocation

2019-10-16 Thread Mikhail Khludnev
Note: adding score=none as a local param. Turns another algorithm dragging by from side join. On Wed, Oct 16, 2019 at 11:37 AM Nicolas Paris wrote: > Sadly, the join performances are poor. > The joined collection is 12M documents, and the performances are 6k ms > versus 60ms when I compare to

Re: Need help with Solr Streaming query

2019-10-16 Thread Erick Erickson
The NOT operator isn’t a Boolean NOT, so it requires some care; Chris Hostetter wrote a good blog about that. Try q=*:* -(:*c* The query q=-something really isn’t valid syntax, but some query parsers help you out by silently putting the *:* in front of it. That’s not guaranteed across all
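The advice above (put an explicit *:* in front of a negative clause rather than relying on the query parser to add it) can be sketched in a few lines of Python. This is only a minimal illustration: the helper name match_all_except and the field category_s are hypothetical and do not come from the thread, and the sketch assumes a single clause to exclude.

```python
from urllib.parse import urlencode


def match_all_except(clause: str) -> str:
    """Build an explicit 'all documents except <clause>' query string.

    The leading *:* selects every document, and the negated clause then
    subtracts the unwanted ones, so no parser has to guess the intent.
    """
    return f"*:* -({clause})"


# "category_s:books" is a hypothetical field and value, used only for illustration.
query = match_all_except("category_s:books")
print(query)

# URL-encode the parameter before sending it to a request handler.
print(urlencode({"q": query}))
```

Writing the *:* explicitly keeps the query independent of whichever handler or parser ends up evaluating it.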

Re: Position search

2019-10-16 Thread Erick Erickson
Three things off the top of my head, in order of how long it’d take to implement: *** If it’s _always_ some distance from the start or end, index special beginning and end tags, perhaps a nonsense string like BEGINslkdjfhsldkfhsdkfh and ENDslakshalskdfhj. Now your searches become phrase

RE: Position search

2019-10-16 Thread Kaminski, Adi
Hi, These are really text positions. For example, I have a document: "hello thanks for calling the support how can I help you" And in the application I would like to search for documents that match "thanks" NEAR "support" only in the first 30 words of the document (greeting part for example), and

Need help with Solr Streaming query

2019-10-16 Thread Prasenjit Sarkar
Hi, I am facing an issue while working with a Solr streaming expression. I am using /export for emitting tuples out of the streaming query. However, when I tried to use the NOT operator in the Solr query, it did not work. The same works with /select. Please find the below query

Re: Query related APACHE SOLR 8.2.0

2019-10-16 Thread sasarun
Hi Rohit, The Solr bundle comes with a Jetty server by default and does not require a Tomcat instance to run. Even though earlier versions of Solr were in the form of a war file, Solr 5.0 and higher versions no longer support user-defined containers. Details of the same are available in the link below

Query related APACHE SOLR 8.2.0

2019-10-16 Thread Rohit Rasal
Hello, We are trying to implement APACHE SOLR 8.2.0 in our organization. In our organization, we use Tomcat for deployment of web applications and the server OS is Suse Linux (SLES v12-sp3). So we have some queries related to the software requirements of APACHE SOLR 8.2.0: 1. Which tomcat minimum

Re: Using Tesseract OCR to extract PDF files in EML file attachment

2019-10-16 Thread Charlie Hull
My colleagues Eric Pugh and Dan Worley covered OCR and Solr in a presentation at our recent London Lucene/Solr Meetup: https://www.meetup.com/Apache-Lucene-Solr-London-User-Group/events/264579498/ (direct link to slides if you can't find it in the comments

Re: Position search

2019-10-16 Thread Alexandre Rafalovitch
So are these really text locations, or rather actually sections of the document? If the latter, can you parse out sections during indexing? Regards, Alex On Wed, Oct 16, 2019, 3:57 AM Kaminski, Adi, wrote: > Hi, > Thanks for the responses. > > It's a soft boundary which is resulted by dynamic

Re: Atomic Updates with PreAnalyzedField

2019-10-16 Thread Oleksandr Drapushko
https://issues.apache.org/jira/browse/SOLR-13850 On Wed, Oct 16, 2019 at 11:25 AM Mikhail Khludnev wrote: > Hello, Oleksandr. > It deserves JIRA, please raise one. > > On Tue, Oct 15, 2019 at 8:17 PM Oleksandr Drapushko > wrote: > > > Hello Community, > > > > I've discovered data loss bug and

Problems with TokenFilter, but only in wildcard queries

2019-10-16 Thread Björn Keil
Hello, I am having a problem with a primitive self-written TokenFilter, namely the GermanUmlautFilter in the example below. It's being used for both queries and indexing. It works perfectly most of the time: it replaces ä with ae, ö with oe and so forth, before ICUFoldingFilter replaces the

Re: Highlighting Solr 8

2019-10-16 Thread sasarun
Hi Eric, The Unified highlighter does not have an option to provide an alternate field when highlighting. That option is available with the Original and FastVector highlighters. As indicated in the Solr documentation, Unified is the recommended method for highlighting to meet most of the use cases. Please do

Re: Solr-Cloud, join and collection collocation

2019-10-16 Thread Nicolas Paris
Sadly, the join performance is poor. The joined collection is 12M documents, and the query time is 6k ms versus 60 ms when I compare to the denormalized field. Apparently, the performance does not change when the filter on the joined collection is changed. It is still 6k ms when the subset

Re: Atomic Updates with PreAnalyzedField

2019-10-16 Thread Mikhail Khludnev
Hello, Oleksandr. It deserves a JIRA, please raise one. On Tue, Oct 15, 2019 at 8:17 PM Oleksandr Drapushko wrote: > Hello Community, > > I've discovered a data loss bug and couldn't find any mention of it. Please > confirm this bug hasn't been reported yet. > > > Description: > > If you try to

RE: Position search

2019-10-16 Thread Kaminski, Adi
Hi, Thanks for the responses. It's a soft boundary which results from dynamic syntax in our application. So it may vary between user searches: one user can search for some "word1" in the first 30 words, and another can search for "word2" in the first 10 words. The use case is to match some

Re: Solr-Cloud, join and collection collocation

2019-10-16 Thread Nicolas Paris
> You can certainly replicate the joined collection to every shard. It > must fit in one shard and a replica of that shard must be co-located > with every replica of the “to” collection. Yes, I found this in the documentation, with a clear example just after this mail. I will test it today. I