Re: Searching for credit card numbers

2020-07-28 Thread lstusr 5u93n4
; does the regex and adds a field has_credit_card_number:true. > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > On Jul 28, 2020, at 11:50 AM, lstusr 5u93n4 wrote: > > > > Let's say I have a text field that's been

Searching for credit card numbers

2020-07-28 Thread lstusr 5u93n4
Let's say I have a text field that's been indexed with the standard tokenizer, and I want to match the docs that have credit card numbers in them (this is for altruistic purposes, not nefarious ones!). What's the best way to build a search that will do this? Searching for " "
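One way to picture the index-time approach suggested in the reply (run a regex, then set a field such as has_credit_card_number:true) is the sketch below. It is only an illustration under assumptions: the class name, the 13 to 19 digit pattern, and the added Luhn checksum step are not from the thread, and the wiring into a Solr update chain is not shown.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CreditCardDetector {
    // Assumption: a candidate is 13 to 19 digits, optionally separated
    // by single spaces or hyphens, and not embedded in a longer digit run.
    private static final Pattern CANDIDATE =
            Pattern.compile("(?<!\\d)\\d(?:[ -]?\\d){12,18}(?!\\d)");

    /** Luhn checksum over a string that contains digits only. */
    static boolean passesLuhn(String digits) {
        int sum = 0;
        boolean doubleIt = false; // the rightmost digit is not doubled
        for (int i = digits.length() - 1; i >= 0; i--) {
            int d = digits.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9;
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }

    /** True if the text holds at least one digit run that passes Luhn. */
    static boolean containsCardNumber(String text) {
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            String digits = m.group().replaceAll("[ -]", "");
            if (passesLuhn(digits)) {
                return true;
            }
        }
        return false;
    }
}
```

The result of containsCardNumber could be stored as a boolean field and filtered on at query time. The Luhn step only reduces false positives; other long digit strings can still pass it by chance.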

solrj - get metrics from all nodes

2020-06-03 Thread lstusr 5u93n4
Hi All, I'm attempting to connect to the metrics api in solrj to query metrics from my cluster. Using the CloudSolrClient, I get routed to one node, and get metrics only from that node. I'm building my request like this: GenericSolrRequest req = new GenericSolrRequest(METHOD.GET,
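As a rough alternative to routing through CloudSolrClient, each node's /admin/metrics endpoint can be queried directly, one request per node. The sketch below swaps solrj for the JDK's own HTTP client, so it is not the approach from the message; the class name, the node base URLs, and the choice of the group parameter are assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NodeMetricsFetcher {
    /** Builds the metrics URL for one node, e.g. base "http://host:8983/solr". */
    static URI metricsUri(String nodeBaseUrl, String group) {
        String base = nodeBaseUrl.endsWith("/")
                ? nodeBaseUrl.substring(0, nodeBaseUrl.length() - 1)
                : nodeBaseUrl;
        return URI.create(base + "/admin/metrics?wt=json&group=" + group);
    }

    /** Queries every node in turn and returns the raw JSON body per node. */
    static Map<String, String> fetchAll(List<String> nodeBaseUrls, String group)
            throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        Map<String, String> bodies = new LinkedHashMap<>();
        for (String node : nodeBaseUrls) {
            HttpRequest request = HttpRequest.newBuilder(metricsUri(node, group))
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            bodies.put(node, response.body());
        }
        return bodies;
    }
}
```

The list of node URLs has to come from somewhere (the cluster's live nodes, for instance); how to obtain it is left out here, and authentication is not handled.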

Re: Apache Solr 8.4.1 Basic Authentication

2020-03-26 Thread lstusr 5u93n4
Hey Emmanuel, If you're using Java, I'd highly suggest using solrj, it'll do the work that you need it to do: SolrRequest req ;//create a new request object req.setBasicAuthCredentials(userName, password); solrClient.request(req); If that doesn't work for you for some reason, you need to

Re: Upgrading Solrcloud indexes from 7.2 to 8.4.1

2020-03-06 Thread lstusr 5u93n4
Hi Webster, When we upgraded from 7.5 to 8.1 we ran into a very strange issue: https://lucene.472066.n3.nabble.com/Stored-field-values-don-t-update-after-7-gt-8-upgrade-td4442934.html We ended up having to do a full re-index to solve this issue, but if you're going to do this upgrade I would

heavy reads from disk when off-heap ram is constrained

2020-02-27 Thread lstusr 5u93n4
Hi All, Something we learned recently that might be useful to the community. We're running solr in docker, and we've constrained each of our containers to have access to 10G of the host's ram. Also, through `docker stats`, we can see the Block IO (filesystem reads/writes) that the solr process

Re: Adding replica to a shard with only down replicas

2020-02-14 Thread lstusr 5u93n4
Actually I should clarify: we stop solr on one of the nodes, wait for the other node to become the leader, and then start solr back up on the one that was stopped. On Fri, 14 Feb 2020 at 09:41, lstusr 5u93n4 wrote: > We've seen this type of deadlock pretty often. Our recourse is to rest

Re: Adding replica to a shard with only down replicas

2020-02-14 Thread lstusr 5u93n4
We've seen this type of deadlock pretty often. Our recourse is to restart solr on only one of the nodes; this seems to force the leader election to take place and it soon starts rebuilding. Let me know if you try that and it works... Wouldn't mind another validation point that this happens to

Re: Solr on HDFS

2019-08-02 Thread lstusr 5u93n4
Hi Joe, We fought with Solr on HDFS for quite some time, and faced issues similar to the ones you're seeing. (See this thread, for example: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201812.mbox/%3cCABd9LjTeacXpy3FFjFBkzMq6vhgu7Ptyh96+w-KC2p=-rqk...@mail.gmail.com%3e ) The Solr lock

Re: Stored field values don't update after 7 -> 8 upgrade

2019-07-09 Thread lstusr 5u93n4
, 5 Jul 2019 at 15:30, lstusr 5u93n4 wrote: > Hi All, > > We have a collection that was created on Solr 7.5, and then Solr was > upgraded to 8.1. After the upgrade, we're seeing that the stored values of > the fields of documents that existed before the upgrade aren't being stored

Stored field values don't update after 7 -> 8 upgrade

2019-07-05 Thread lstusr 5u93n4
Hi All, We have a collection that was created on Solr 7.5, and then Solr was upgraded to 8.1. After the upgrade, we're seeing that the stored values of the fields of documents that existed before the upgrade aren't being stored when the record is updated, even though the indexed value is. For

Re: Load balance writes

2019-02-11 Thread lstusr 5u93n4
Hi Boban, First of all: I agree with Walter here. Because the bottleneck is during indexing on the leader, a basic round robin load balancer will perform just as well as a custom solution. With far less headache. A custom solution will be far more work than it's worth. But, should you really
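For reference, the "basic round robin" idea amounts to very little code. The sketch below is a minimal client-side version, assuming a fixed list of node URLs; the class and method names are made up for illustration and it is not a replacement for a real load balancer (no health checks, no retries).

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinSelector {
    private final List<String> nodes;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinSelector(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("need at least one node");
        }
        this.nodes = List.copyOf(nodes);
    }

    /** Returns the nodes in order, wrapping around; safe to call from many threads. */
    public String nextNode() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }
}
```

Each update request would be sent to whatever nextNode() returns; Solr then forwards the documents to the right shard leader itself.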

Re: Solr Cloud wiping all cores when restart without proper zookeeper directories

2019-01-09 Thread lstusr 5u93n4
We've seen the same thing on solr 7.5 by doing: - create a collection - add some data - stop solr on all servers - delete all contents of the solr node from zookeeper - start solr on all nodes - create a collection with the same name as in the first step When doing this, solr wipes out the

Re: solr reads whole index on startup

2018-12-21 Thread lstusr 5u93n4
an you share the Solr HDFS configuration settings that you tested > with? Blockcache and direct memory size? I'd be curious just as a > reference point. > > Kevin Risden > > On Thu, Dec 20, 2018 at 10:31 AM lstusr 5u93n4 > wrote: > > > > Hi All, > > >

Re: solr reads whole index on startup

2018-12-20 Thread lstusr 5u93n4
ce to the results, but the averages are representative of the times we're seeing. Thanks for reading! Kyle On Mon, 10 Dec 2018 at 14:14, lstusr 5u93n4 wrote: > Hi Guys, > > > What OS is it on? > CentOS 7 > > > With your indexes in HDFS, the HD

Re: solr reads whole index on startup

2018-12-10 Thread lstusr 5u93n4
Hi Guys, > What OS is it on? CentOS 7 > With your indexes in HDFS, the HDFS software running > inside Solr also needs heap memory to operate, and is probably going to > set aside part of the heap for caching purposes. We still have the solr.hdfs.blockcache.slab.count parameter set to the

Re: solr reads whole index on startup

2018-12-06 Thread lstusr 5u93n4
lot of Solr installations, and never seen one > be stable with that much disparity between index size and available > RAM so part of this is "the voice of experience". Whether that > experience is accurate or not is certainly debatable. > > Best, > Erick > On Wed, Dec

Re: solr reads whole index on startup

2018-12-05 Thread lstusr 5u93n4
the surface to be an extremely undersized > system, and unless and until you properly size it you'll have problems > > Best, > Erick > On Wed, Dec 5, 2018 at 10:12 AM lstusr 5u93n4 wrote: > > > > Hi Kevin, > > > > We do have logs. Grepping for pe

Re: solr reads whole index on startup

2018-12-05 Thread lstusr 5u93n4
locate the core, because it's there in ${SOLR_HOME} and also exists on hdfs... Thanks! Kyle On Wed, 5 Dec 2018 at 13:12, lstusr 5u93n4 wrote: > Hi Kevin, > > We do have logs. Grepping for peersync, I can see > > solr | 2018-12-05 03:31:41.301 INFO > (coreZkR

Re: solr reads whole index on startup

2018-12-05 Thread lstusr 5u93n4
ore solr has been launched on the other servers. Kyle On Wed, 5 Dec 2018 at 12:58, Kevin Risden wrote: > Do you have logs right before the following? > > "we notice that the nodes go into "Recovering" state for about 10-12 hours > before finally coming alive." > >

solr reads whole index on startup

2018-12-05 Thread lstusr 5u93n4
Hi All, We have a collection: - solr 7.5 - 3 shards, replication factor 2 for a total of 6 NRT replicas - 3 servers, 16GB ram each - 2 billion documents - autoAddReplicas: false - 2.1 TB on-disk index size - index stored on hdfs on separate servers. If we (gracefully) shut down

Re: solrj - Batching and Optimistic Concurrency

2018-12-04 Thread lstusr 5u93n4
option (assuming you're not > > assigning _version_ yourself)? > > > > Best, > > Erick > > On Mon, Dec 3, 2018 at 11:57 AM lstusr 5u93n4 > wrote: > > > > > > Hi All, > > > > > > I have a scenario where I'm trying to enable batching on

solrj - Batching and Optimistic Concurrency

2018-12-03 Thread lstusr 5u93n4
Hi All, I have a scenario where I'm trying to enable batching on the solrj client, but trying to see how that works with Optimistic Concurrency. From what I can tell, if I pass a list of SolrInputDocument to my solr client, and a document somewhere in that list contains a `_version_` field that

Re: PathHierarchyTokenizerFactory single level match

2018-11-23 Thread lstusr 5u93n4
ount" set to 3. Now your secondary search becomes > "q=whatever&fq=category:Books/NonFic&fq=level_count:2". > > Best, > Erick > On Fri, Nov 23, 2018 at 6:24 AM lstusr 5u93n4 wrote: > > > > Hi, > > > > I have a sche
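A level_count value like the one in this reply could be computed on the client before indexing. The sketch below is an assumption-laden illustration: the class and method names are invented, "/" is assumed as the delimiter, and prefixes() only mimics the kind of ancestor paths a path hierarchy tokenizer emits rather than calling Solr itself.

```java
import java.util.ArrayList;
import java.util.List;

public class PathLevels {
    /** "Books/NonFic/Law" -> [Books, Books/NonFic, Books/NonFic/Law]. */
    static List<String> prefixes(String path) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue; // ignore leading or doubled delimiters
            }
            if (current.length() > 0) {
                current.append('/');
            }
            current.append(part);
            out.add(current.toString());
        }
        return out;
    }

    /** Number of path levels, suitable for storing in a level_count field. */
    static int levelCount(String path) {
        return prefixes(path).size();
    }
}
```

With this, a document under Books/NonFic would carry level_count 2, so filtering on that value keeps only the single-level matches.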

PathHierarchyTokenizerFactory single level match

2018-11-23 Thread lstusr 5u93n4
Hi, I have a schema that has a descendent_path field as configured in the PathHierarchyTokenizerFactory docs: Using the example in the docs: *For example, in the configuration below a query for Books/NonFic will match documents indexed with values like

Re: solr cloud - hdfs folder structure best practice

2018-11-02 Thread lstusr 5u93n4
e are issues with autoAddReplicas or other types of failovers > if there are different home folders. > > I've run Solr on HDFS with the same basic configs as listed here: > > https://risdenk.github.io/2018/10/23/apache-solr-running-on-apache-hadoop-hdfs.html > > Kevin Risden > > > On

solr cloud - hdfs folder structure best practice

2018-11-02 Thread lstusr 5u93n4
Hi All, Here's a question that I can't find an answer to in the documentation: When configuring solr cloud with HDFS, is it best to: a) provide a unique hdfs folder for each solr cloud instance or b) provide the same hdfs folder to all solr cloud instances. So for example, if I have two

Re: Solr cloud - poweroff procedure

2018-10-31 Thread lstusr 5u93n4
-replicas.html#using-cluster-property-to-enable-autoaddreplicas > > On Wed, Oct 31, 2018 at 3:27 AM lstusr 5u93n4 wrote: > > > Hi All, > > > > We have a solr cloud running 3 shards, 3 hosts, 6 total NRT replicas, and > > the data directory on hdfs. It has 950 mill

Solr cloud - poweroff procedure

2018-10-30 Thread lstusr 5u93n4
Hi All, We have a solr cloud running 3 shards, 3 hosts, 6 total NRT replicas, and the data directory on hdfs. It has 950 million documents in the index, occupying 700GB of disk space. We need to completely power off the system to move it. Are there any actions we should take on shutdown to help