Re: Three questions about huge tlog problem and CDCR

2019-12-18 Thread alwaysbluesky
Found a typo. Correction: "updateLogSynchronizer" is set to 6(1 min), not 1 hour.

Three questions about huge tlog problem and CDCR

2019-12-18 Thread Louis
* Environment: Solr Cloud 7.7.0, 3 nodes / CDCR bidirectional / CDCR buffer disabled Hello All, I have a problem with tlogs: they keep getting bigger and bigger. They don't seem to be deleted at all, even after a hard commit, so the total size of the tlog files is now more than 21GB. Actually I

Re: how to exclude path from being queried

2019-12-18 Thread Shawn Heisey
On 12/18/2019 1:21 PM, Nan Yu wrote: I am trying to find all files containing a keyword in a directory (and many sub-directories). I did a quick indexing using bin/post -c myCore /RootDir When I query the index using "keyword", all files whose path contains the keyword

Re: how to exclude path from being queried

2019-12-18 Thread Paras Lehana
Hi Nan, Are you using PathHierarchyTokenizer? On Thu, 19 Dec 2019 at 01:51, Nan Yu wrote: > Hi, > I am trying to find all files containing a keyword in a directory (and many sub-directories). >

Re: Synonym expansions w/ phrase slop exhausting memory after upgrading to SOLR 7

2019-12-18 Thread Nick D
Michael, Thank you so much, that was extremely helpful. My googlefu wasn't good enough I guess. 1. Was my initial fix just to stop it from exploding. 2. Will be the perm solutions for now until we can get some things squared away for 8.0. Sounds like even in 8 there is a problem with any graph

Re: Starting Solr automatically

2019-12-18 Thread Shawn Heisey
On 12/16/2019 9:48 PM, Anuj Bhargava wrote: Often solr stops working. We then have to go to the root directory and give the command 'service solr start'. Is there a way to automatically start solr when it stops? If Solr is stopping, then something went wrong. Something that will probably

how to exclude path from being queried

2019-12-18 Thread Nan Yu
Hi, I am trying to find all files containing a keyword in a directory (and many sub-directories). I did a quick indexing using bin/post -c myCore /RootDir When I query the index using "keyword", all files whose path contains the keyword will be included in the search

Re: CVE-2017-7525 fix for Solr 7.7.x

2019-12-18 Thread Kevin Risden
There are no specific plans for any 7.x branch releases that I'm aware of. SOLR-13110 in particular required upgrading Hadoop from 2.x to 3.x because of jackson-mapper-asl, and there are no plans to backport that to 7.x even if there were a future 7.x release. Kevin Risden On Wed, Dec

Re: number of files indexed (re-formatted)

2019-12-18 Thread Erick Erickson
I’d urge you to consider moving the process from using ExtractingRequestHandler (i.e. just sending the data to Solr) to doing the Tika parser externally. ExtractingRequestHandler is a great way to get started, but I’ve often found that I need much finer control over the process. Here’s the

Re: number of files indexed (re-formatted)

2019-12-18 Thread Jörn Franke
This depends on your ingestion process. Usually, unique ids that are not filenames either do not come from a file, or your ingestion process does not tell Solr the file name. In this case the Collection seems to be configured to generate a unique identifier. Maybe you can describe in more detail how

Re: Synonym expansions w/ phrase slop exhausting memory after upgrading to SOLR 7

2019-12-18 Thread Michael Gibney
This is related to this issue: https://issues.apache.org/jira/browse/SOLR-13336 Also tangentially relevant: https://issues.apache.org/jira/browse/LUCENE-8531 https://issues.apache.org/jira/browse/SOLR-12243 I think your options include: 1. setting slop=0, which restores SpanNearQuery as the

number of files indexed (re-formatted)

2019-12-18 Thread Nan Yu
Sorry, I just found out that the mailing list takes plain text and my previous post looks really messy, so I reformatted it. Hi, I did a simple indexing of a directory that contains a lot of pdf, text, doc, zip, etc. files. The content of the files has no structure and I would like

number of files indexed

2019-12-18 Thread Nan Yu
Hi, I did a simple indexing of a directory that contains a lot of pdf, text, doc, zip, etc. files. The content of the files has no structure and I would like to index them and later on search "key words" within the files. After creating the core, I indexed the files in the directory

Move SOLR from cloudera HDFS to SOLR on Docker

2019-12-18 Thread Wael Kader
Hello, I want to move data from my SOLR setup on Cloudera Hadoop to a Docker SOLR container. I don't need to run all the Hadoop services in my setup, as I am currently using only SOLR from the Cloudera HDP. My concern now is to find the best way to move the data and schema to Docker

CVE-2017-7525 fix for Solr 7.7.x

2019-12-18 Thread Mehai, Lotfi
Hello, We are using Solr 7.7.0. CVE-2017-7525 has been fixed for Solr 8.x: https://issues.apache.org/jira/browse/SOLR-13110 When will the fix be available for Solr 7.7.x? Lotfi