Re: Solr training

2020-09-17 Thread matthew sporleder
Is there a friends-on-the-mailing list discount? I had a bit of sticker shock! On Wed, Sep 16, 2020 at 9:38 AM Charlie Hull wrote: > > I do of course mean 'Group Discounts': you don't get a discount for > being in a 'froup' sadly (I wasn't even aware that was a thing!) > > Charlie > > On

Re: How to remove duplicate tokens from solr

2020-09-17 Thread Rajdeep Sahoo
But I'm not sure why this type of search string is causing high CPU utilization. On Fri, 18 Sep, 2020, 12:49 am Rahul Goswami, wrote: > Is this for a phrase search? If yes then the position of the token would > matter too and not sure which token would you want to remove. "eg > "tshirt hat

Re: How to remove duplicate tokens from solr

2020-09-17 Thread Rahul Goswami
Is this for a phrase search? If yes, then the position of the token would matter too, and I'm not sure which token you would want to remove, e.g. "tshirt hat tshirt". Also, are you looking to save space and want this at index time? Or do you just want to remove duplicates from the search string? If this is at

RE: Need to update SOLR_HOME in the solr service script and getting errors

2020-09-17 Thread Victor Kretzer
Hi Mark. Thanks for taking the time to explain it so clearly. It makes perfect sense to me now and using chown solved the problem. Thanks again and have a great day. Victor -Original Message- From: Mark H. Wood Sent: Thursday, September 17, 2020 9:59 AM To:

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread Alexandre Rafalovitch
The missing underscore is a documentation bug, because it was not escaped the second time and the asciidoc chewed it up as a bold/italic indicator. The declaration and references should match. I am not sure about the code. I hope somebody else will step in on that part. Regards, Alex. On

Re: How to remove duplicate tokens from solr

2020-09-17 Thread Rajdeep Sahoo
If someone is searching with " tshirt tshirt tshirt tshirt tshirt tshirt" we need to remove the duplicates and search with tshirt. On Fri, 18 Sep, 2020, 12:19 am Alexandre Rafalovitch, wrote: > This is not quite enough information. > There is >
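One possible client-side approach, separate from any Solr filter: collapse repeated terms in the raw search string before it is sent. This is only a minimal sketch. The class and method names (`QueryDeduper`, `dedupeTerms`) are hypothetical, it assumes a plain whitespace-separated keyword query, and it compares terms case-sensitively. As noted elsewhere in this thread, it would change the meaning of a phrase search, where token positions matter.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Solr or SolrJ.
class QueryDeduper {

    /** Collapses repeated whitespace-separated terms, keeping first-seen order. */
    static String dedupeTerms(String query) {
        String trimmed = (query == null) ? "" : query.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        // LinkedHashSet drops duplicates while preserving insertion order.
        Set<String> unique = new LinkedHashSet<>(Arrays.asList(trimmed.split("\\s+")));
        return String.join(" ", unique);
    }

    public static void main(String[] args) {
        System.out.println(dedupeTerms(" tshirt tshirt tshirt tshirt tshirt tshirt"));
        System.out.println(dedupeTerms("tshirt hat tshirt"));
    }
}
```

The deduplicated string would then be passed as the query parameter in place of the user's original input.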

Re: How to remove duplicate tokens from solr

2020-09-17 Thread Alexandre Rafalovitch
This is not quite enough information. There is https://lucene.apache.org/solr/guide/8_6/filter-descriptions.html#remove-duplicates-token-filter but it has specific limitations. What is the problem that you are trying to solve that you feel is due to duplicate tokens? Why are they duplicates? Is

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread Pratik Patel
I am running this in a unit test which deletes the collection after the test is over. So every new test run gets a fresh collection. It is a very simple test where I am first indexing a couple of parent documents with few children and then testing an atomic update on one parent as I have posted

How to remove duplicate tokens from solr

2020-09-17 Thread Rajdeep Sahoo
Hi team, Is there any way to remove duplicate tokens from Solr? Is there a filter for this?

Re: Doing what does using SolrJ API

2020-09-17 Thread Steven White
Thank you all for your feedback. It is very helpful. @Walter, out of the 1000 fields in Solr's schema, only 5 are set as "required" fields and the Solr doc that I create and then send to Solr for indexing contains only those fields that have data to be indexed. So some docs will have 10
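A minimal sketch of the "send only the fields that have data" idea described above, using a plain `Map` as a stand-in for the document. The class name `SparseDocBuilder`, the method `nonEmptyFields`, and the sample field names are all hypothetical. In a real SolrJ client, each surviving entry would presumably be passed to `SolrInputDocument.addField(name, value)`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: a Map stands in for the document sent to Solr.
class SparseDocBuilder {

    /** Returns a copy of the row containing only fields that carry data. */
    static Map<String, Object> nonEmptyFields(Map<String, Object> row) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            Object value = entry.getValue();
            if (value == null) {
                continue; // no data for this field
            }
            if (value instanceof String && ((String) value).trim().isEmpty()) {
                continue; // blank strings are treated as "no data"
            }
            doc.put(entry.getKey(), value);
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", "42");
        row.put("title", "Blue tshirt");
        row.put("color", "");      // empty in the source row
        row.put("size", null);     // missing in the source row
        System.out.println(nonEmptyFields(row).keySet());
    }
}
```

Fields that are skipped here never reach Solr, so only the required fields and the populated optional ones end up in each document.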

Re: Handling failure when adding docs to Solr using SolrJ

2020-09-17 Thread Erick Erickson
I recommend _against_ issuing explicit commits from the client; let your solrconfig.xml autocommit settings take care of it. Make sure either your soft or hard commits open a new searcher for the docs to be searchable. I’ll bend a little bit if you can _guarantee_ that you only ever have one

Re: Doing what does using SolrJ API

2020-09-17 Thread Erick Erickson
The script can actually be written in any number of scripting languages (Python, Groovy, JavaScript, etc.), but Alexandre’s comments about JavaScript are well taken. It all depends here on whether you ever want to search the fields individually. If you do, you need to have them in your index as

Re: Doing what does using SolrJ API

2020-09-17 Thread Walter Underwood
If you want to ignore a field being sent to Solr, you can set indexed=false and stored=false for that field in schema.xml. It will take up room in schema.xml but zero room on disk. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Sep 17, 2020, at

Re: Doing what does using SolrJ API

2020-09-17 Thread Alexandre Rafalovitch
Solr has a whole pipeline that you can run during document ingesting before the actual indexing happens. It is called Update Request Processor (URP) and is defined in solrconfig.xml or in an override file. Obviously, since you are indexing from SolrJ client, you have even more flexibility, but it

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread Alexandre Rafalovitch
Did you reindex the original document after you added a new field? If not, then the previously indexed content is missing it and your code paths will get out of sync. Regards, Alex. P.s. I haven't done what you are doing before, so there may be something I am missing myself. On Thu, 17 Sep

Re: Doing what does using SolrJ API

2020-09-17 Thread Steven White
Thanks Erick. Where can I learn more about the "stateless script update processor factory"? I don't know what you mean by this. Steven On Thu, Sep 17, 2020 at 1:08 PM Erick Erickson wrote: > 1000 fields is fine, you'll waste some cycles on bookkeeping, but I really > doubt you'll notice. That

Handling failure when adding docs to Solr using SolrJ

2020-09-17 Thread Steven White
Hi everyone, I'm trying to figure out when and how I should handle failures that may occur during indexing. In the sample code below, look at my comment and let me know what state my index is in when things fail: SolrClient solrClient = new HttpSolrClient.Builder(url).build();

Re: Doing what does using SolrJ API

2020-09-17 Thread Erick Erickson
1000 fields is fine, you'll waste some cycles on bookkeeping, but I really doubt you'll notice. That said, are these fields used for searching? Because you do have control over what goes into the index if you can put a "stateless script update processor factory" in your update chain. There you can

Re: Doing what does using SolrJ API

2020-09-17 Thread Steven White
Hi Erick, Yes, this is coming from a DB. Unfortunately I have no control over the list of fields. Out of the 1000 fields that there may be, no document that gets indexed into Solr will use more than about 50, and since I'm copying the values of those fields to the catch-all field and the

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread Pratik Patel
Thanks for your reply Alexandre. I have "_root_" and "_nest_path_" fields in my schema but not "_nest_parent_". I ran my test after adding the "_nest_parent_" field and I am not getting NPE any more which is good. Thanks! But looking at the documents in the index, I see that after the

Re: Help using Noggit for streaming JSON data

2020-09-17 Thread Yonik Seeley
See this method: /** Reads a JSON string into the output, decoding any escaped characters. */ public void getString(CharArr output) throws IOException And then the idea is to create a subclass of CharArr to incrementally handle the string that is written to it. You could overload write
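The general pattern described here, a sink subclass that processes each chunk as it is written instead of buffering the whole string, can be illustrated with the JDK alone. The sketch below uses `java.io.Writer` as a stand-in and is not Noggit code: which `CharArr` methods to override is not shown in this message, and the class name `ChunkCountingWriter` is hypothetical.

```java
import java.io.IOException;
import java.io.Writer;

// Stand-in for a CharArr subclass: handles each chunk as it arrives
// and keeps only running totals, never the full string.
class ChunkCountingWriter extends Writer {
    private long totalChars = 0;
    private int chunks = 0;

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        // Real code would process cbuf[off .. off+len) here,
        // e.g. forward it to a file or a digest, then discard it.
        totalChars += len;
        chunks++;
    }

    @Override
    public void flush() {
        // Nothing is buffered, so there is nothing to flush.
    }

    @Override
    public void close() {
        // No resources to release in this sketch.
    }

    long totalChars() {
        return totalChars;
    }

    int chunks() {
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        ChunkCountingWriter sink = new ChunkCountingWriter();
        sink.write("hello");
        sink.write(" world");
        System.out.println(sink.totalChars() + " chars in " + sink.chunks() + " chunks");
    }
}
```

Memory use stays bounded by the chunk size rather than by the length of the JSON string being decoded.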

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread Alexandre Rafalovitch
Can you double-check your schema to see if you have all the fields required to support nested documents? You are supposed to get away with just _root_, but really you should also include _nest_path_ and _nest_parent_. Your particular exception seems to be triggering something (maybe a bug) related

Help using Noggit for streaming JSON data

2020-09-17 Thread Christopher Schultz
All, Is this an appropriate forum for asking questions about how to use Noggit? The Github doesn't have any discussions available and filing an "issue" to ask a question is kinda silly. I'm happy to be redirected to the right place if this isn't appropriate. I've been able to figure out most

Re: NPE Issue with atomic update to nested document or child document through SolrJ

2020-09-17 Thread pratik@semandex
Following are the approaches I have tried so far, and both result in an NPE. *approach 1 TestChildPOJO testChildPOJO = new TestChildPOJO().cId( "c1_child1" ) .conceptid( "c1" )

Re: Need to update SOLR_HOME in the solr service script and getting errors

2020-09-17 Thread Mark H. Wood
On Wed, Sep 16, 2020 at 02:59:32PM +, Victor Kretzer wrote: > My setup is two solr nodes running on separate Azure Ubuntu 18.04 LTS vms > using an external zookeeper assembly. > I installed Solr 6.6.6 using the install file and then followed the steps for > enabling ssl. I am able to start

Re: Unable to create core Solr 8.6.2

2020-09-17 Thread Erick Erickson
Look in your solr log, there’s usually a more detailed message > On Sep 17, 2020, at 9:35 AM, Anuj Bhargava wrote: > > Getting the following error message while trying to create core > > # sudo su - solr -c "/opt/solr/bin/solr create_core -c 9lives" > WARNING: Using _default configset with

Unable to create core Solr 8.6.2

2020-09-17 Thread Anuj Bhargava
Getting the following error message while trying to create core # sudo su - solr -c "/opt/solr/bin/solr create_core -c 9lives" WARNING: Using _default configset with data driven schema functionality. NOT RECOMMENDED for production use. To turn off: bin/solr config -c 9lives -p 8984

Re: Doing what does using SolrJ API

2020-09-17 Thread Erick Erickson
“there over 1000 of them[fields]” This is often a red flag in my experience. Solr will handle that many fields, I’ve seen many more. But this is often a result of “database thinking”, i.e. your mental model of how all this data is from a DB perspective rather than a search perspective. It’s

Re: Fetched but not Added Solr 8.6.2

2020-09-17 Thread Anuj Bhargava
SolrWriter Error creating document : On Thu, 17 Sep 2020 at 15:53, Jörn Franke wrote: > Log file will tell you the issue. > > > Am 17.09.2020 um 10:54 schrieb Anuj Bhargava : > > > > We just installed Solr 8.6.2 > > It is fetching the data but not adding > > > > Indexing completed.

Re: Fetched but not Added Solr 8.6.2

2020-09-17 Thread Jörn Franke
Log file will tell you the issue. > Am 17.09.2020 um 10:54 schrieb Anuj Bhargava : > > We just installed Solr 8.6.2 > It is fetching the data but not adding > > Indexing completed. *Added/Updated: 0 *documents. Deleted 0 documents. > (Duration: 06s) > Requests: 1 ,* Fetched: 100* 17/s,

Fetched but not Added Solr 8.6.2

2020-09-17 Thread Anuj Bhargava
We just installed Solr 8.6.2 It is fetching the data but not adding Indexing completed. *Added/Updated: 0 *documents. Deleted 0 documents. (Duration: 06s) Requests: 1 ,* Fetched: 100* 17/s, Skipped: 0 , Processed: 0 The *data-config.xml*