Re: Getting SOE after a clean import and restart on Solr 7.2.1

2018-02-13 Thread Mohsen Saboorian
Any idea? Can someone direct me to where the error comes from? As I see it, Noggit is a streaming JSON parser, but where is the JSON data? I can only see three tiny files in my core which are in JSON format. All of them are valid: 1. _schema_analysis_stopwords_english {

Re: docvalues set to true, and indexed is false and stored is set to false

2018-02-13 Thread mganeshs
Hi, Thanks for clarifying. But this link (Enabling DocValues) says that it supports StrField and UUIDField as well. Also, what do you mean by it's not free for large segments? Can you point me to some

RE: lat/long (location ) field context filters for autosuggestions

2018-02-13 Thread Deepak Udapudi
Thanks Emir for the response. Please have a look at the URL below. https://lucene.apache.org/solr/guide/6_6/suggester.html We used the above URL to get the suggestions based on the context field. We used zip code as the context field to fetch the dentists in that particular zip code. It is

RE: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread Howe, David
I have set docValues=false on all of the string fields in our index that have indexed=false and stored=true. This gave a small improvement in the index size from 13.3GB to 12.82GB. I have also tried running an optimize, which then reduced the index to 12.6GB. Next step is to dump the sizes
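To put the reported sizes in proportion, here is a short worked calculation. It uses only the three figures quoted in the message above; the helper name `percent_reduction` is just for illustration.

```python
def percent_reduction(before_gb, after_gb):
    """Return the size reduction from before_gb to after_gb as a percentage."""
    return (before_gb - after_gb) / before_gb * 100.0

# Index sizes in GB, as quoted in the message.
baseline, no_docvalues, optimized = 13.3, 12.82, 12.6

step1 = percent_reduction(baseline, no_docvalues)   # effect of docValues=false
step2 = percent_reduction(no_docvalues, optimized)  # further effect of optimize
total = percent_reduction(baseline, optimized)      # both steps combined

print("docValues=false: {:.1f}%".format(step1))
print("optimize:        {:.1f}%".format(step2))
print("combined:        {:.1f}%".format(total))
```

Both steps together come to roughly a 5% reduction, which is consistent with the message calling the improvement small.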

Re: Judging the MoreLikeThis results for relevancy

2018-02-13 Thread Arnold Bronley
Thanks for the reply, Alessandro. Can you please elaborate on the point "a document which has a score 50% of the original doc score, it doesn't mean it is 50% similar"? I did not understand this for two reasons: 1. In the end, we are calculating similarity score between documents when we are

Re: Solr - Managed Resources REST API to get stopwords

2018-02-13 Thread Steve Rowe
Hi, Have you added any stopwords? As mentioned in the ref guide, the techproducts configset includes a pre-defined list of stopwords, but your configset may not have the managed stopwords storage file. --

Re: SolrCloud: How best to do backups?

2018-02-13 Thread Kelly, Frank
Sorry - just got back to this 1. We can stand up the AWS resources quickly (~ 30 mins) but the process of repopulating the index is very slow (< 1k docs per second). We need to fix this but I'm hoping a backup solution would be a mitigation in the meantime. 2. Yes we have autoscaling (with a

Solr - Managed Resources REST API to get stopwords

2018-02-13 Thread ruby
I added the following to my Solr schema: and then restarted Solr. Should the following query return all stopwords? http://localhost/solr/collection/schema/analysis/stopwords/english I don't get
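For reference, the managed stopwords endpoint asked about here can be addressed as sketched below. This is only a sketch: the host, port, and collection name are placeholders (assumptions), and the `english` handle is taken from the URL in the message. The code only builds the URL; it does not contact a server.

```python
from urllib.parse import quote

def managed_stopwords_url(base_url, collection, handle):
    """Build the URL of a managed stopwords resource.

    base_url and collection are placeholders for illustration; the path
    layout mirrors the URL quoted in the message above.
    """
    return "{}/solr/{}/schema/analysis/stopwords/{}".format(
        base_url.rstrip("/"), quote(collection, safe=""), quote(handle, safe="")
    )

# Hypothetical host and collection name.
url = managed_stopwords_url("http://localhost:8983", "collection", "english")
print(url)
```

A GET on such a URL can presumably only list words if the schema actually declares a managed stop filter under that handle and the managed storage file exists, so an empty or error response is worth checking against the configset first.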

RE: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread Howe, David
Thanks Hoss. I will try setting docValues to false, as we only ever want to be able to retrieve the value of this field. Regards, David

RE: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread Howe, David
Hi Erick, Thanks for responding. You are correct that we don't have any deleted docs. When we want to re-index (once a fortnight), we build a brand new installation of Solr from scratch and re-import the new data into an empty index. I will try setting docValues to false and see if that

RE: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread Howe, David
Hi Alessandro, The docker image is like a disk image of the entire server, so it includes the operating system, the Solr installation and the data. Because we run in the cloud and our index isn't that big, this is an easy and fast way for us to scale our Solr cluster without having to

Re: Using Synonyms as a feature with LTR

2018-02-13 Thread Roopa Rao
Thank you, Alessandro, I was trying these options before replying. Yes, I am looking to generate a score for a query with synonym expansion (not a binary feature). I can go with the "title" field and have that include the synonyms in analysis. The only problem is that the number of fields and number

Re: Solr streaming expression - options for Full Outer Join

2018-02-13 Thread Ganesh Sethuraman
Also want to add, I am trying to do this on Solr 7.2.1 On Tue, Feb 13, 2018 at 1:38 PM, Ganesh Sethuraman wrote: > > I would like to perform a full outer join (emit documents from both left and > right, and if there are common ones, combine them) with Solr streaming decorators > on

Re: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread David Hastings
To piggy back on this, what would be the right scenarios to use docvalues='true'? On Tue, Feb 13, 2018 at 1:10 PM, Chris Hostetter wrote: > > : We are using Solr 7.1.0 to index a database of addresses. We have found > : that our index size increases massively when we

Re: Solr streaming expression - options for Full Outer Join

2018-02-13 Thread Ganesh Sethuraman
One typo in the above streaming expression sort: it is "id asc" in the collection col1 On Tue, Feb 13, 2018 at 1:38 PM, Ganesh Sethuraman wrote: > > I would like to perform a full outer join (emit documents from both left and > right, and if there are common ones, combine them) with

Solr streaming expression - options for Full Outer Join

2018-02-13 Thread Ganesh Sethuraman
I would like to perform a full outer join (emit documents from both left and right, and if there are common ones, combine them) with Solr streaming decorators on two collections and "update" it to a new destination collection. I see that the merge decorator option exists, but this seems to return two JSON documents for
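To make the intended result concrete, here is a plain-Python sketch of a full outer join over two streams sorted by a shared key. It is not a Solr streaming expression and makes no claim about which decorator can produce this; the field names `id`, `a`, and `b` and the sample documents are made up for illustration.

```python
def full_outer_join(left, right, key="id"):
    """Join two lists of dicts, each sorted ascending by `key` with unique keys.

    Emits every document from both sides; when a key appears on both
    sides, the two documents are combined into one.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk == rk:
            combined = dict(left[i])
            combined.update(right[j])  # right-side fields win on name clashes
            out.append(combined)
            i += 1
            j += 1
        elif lk < rk:
            out.append(dict(left[i]))
            i += 1
        else:
            out.append(dict(right[j]))
            j += 1
    # One side is exhausted; emit whatever remains on the other side.
    out.extend(dict(d) for d in left[i:])
    out.extend(dict(d) for d in right[j:])
    return out

# Hypothetical documents standing in for the two collections.
col1 = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
col2 = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}]
joined = full_outer_join(col1, col2)
print(joined)
```

The difference from a plain merge is the equal-key branch: a merge would emit both documents for `id` 2, while the join emits a single combined document.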

Re: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread Chris Hostetter
: We are using Solr 7.1.0 to index a database of addresses. We have found : that our index size increases massively when we add one extra field to : the index, even though that field is stored and not indexed, and doesn’t what about docValues? : When we run an index load without the

Replicas: sending query to leader and replica simultaneously

2018-02-13 Thread SOLR4189
Hi all, I use Solr 6.5.1 and I want to start using replicas in SolrCloud mode. I have read the ref guide and Solr in Action, and I want to confirm just one thing about replicas: is it correct that Solr can't send a query to both the leader and a replica simultaneously and return the faster of the two responses? (in the case

Re: Solr search word NOT followed by another word

2018-02-13 Thread Emir Arnautović
Hi Ivan, Which version of Solr do you use? I’ve just tried it on 6.5.1 and it returned the expected results. Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training - http://sematext.com/ > On 13 Feb 2018, at 16:08, ivan

Re: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread Erick Erickson
David: Right, Optimize Is Evil. Well, actually in your case it's not. In your specific case you can optimize every time you build your index and be OK, gory details here: https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/ But that's just for background. The key

Re: High CPU and Physical Memory Usage in solr with 4000 user load

2018-02-13 Thread rubi.hali
Hi Shawn, As asked, I have attached the GC log and a snapshot of the top command output (TopCommandSlave1.jpg). Regarding blocked threads, we are fetching facets and doing grouping with the main query, and for those fields docValues were

Re: facet.method=uif not working in solr cloud?

2018-02-13 Thread Yonik Seeley
Great, thanks for tracking that down! It's interesting that a mincount of 0 disables uif processing in the first place. IIRC, it's only the hash-based method (as opposed to array-based) that can't return zero counts. -Yonik On Tue, Feb 13, 2018 at 6:17 AM, Alessandro Benedetti

Re: Solr search word NOT followed by another word

2018-02-13 Thread ivan
Hi Emir, unfortunately that does not work, since I'm not getting a match for my third example ("Leonardo is the name of Leonardo da Vinci") because I have both "Leonardo" and "Leonardo da Vinci" in the same field. I'm fine with having "Leonardo da Vinci" as long as I have another "Leonardo" (NOT

Re: docvalues set to true, and indexed is false and stored is set to false

2018-02-13 Thread Emir Arnautović
Hi, It is clearer now, but you mentioned strings in your first mail and in place updates only work for numeric fields. If you meet all conditions, document will not be reindexed, but only doc values rewritten for the segment where in place update happened. Note that this is not free for large
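As an illustration of the kind of request being discussed, the sketch below builds an atomic-update document of the form that in-place updates act on. The field name `popularity` and the document id are hypothetical, and the conditions stated in the docstring reflect my reading of the reference guide; treat them as an assumption to verify against your Solr version.

```python
import json

def in_place_update_doc(doc_id, field, operation, value):
    """Build one atomic-update document for Solr's JSON update format.

    Assumption: `field` is a single-valued numeric docValues field with
    indexed=false and stored=false, which is what allows the change to be
    applied in place instead of re-indexing the whole document.
    """
    if operation not in ("set", "inc"):
        raise ValueError("in-place updates use 'set' or 'inc'")
    return {"id": doc_id, field: {operation: value}}

# Hypothetical document id and field name, for illustration only.
payload = json.dumps([in_place_update_doc("doc-1", "popularity", "inc", 1)])
print(payload)
```

If the target field does not meet those conditions (a string field, for example), the same payload would presumably be handled as a regular atomic update, which re-indexes the document.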

Re: docvalues set to true, and indexed is false and stored is set to false

2018-02-13 Thread mganeshs
Hi, I guess my point was not conveyed correctly. Here I am talking about the feature "In Place Updates". As per the above link, it says that the complete document will not be

Re: Solr search word NOT followed by another word

2018-02-13 Thread Emir Arnautović
Hi Ivan, You might be able to use the complexphrase query parser to get what you need. You can test something like this: {!complexphrase df=my_field}"Leonardo -(da Vinci)" This should return any Leonardo that is not followed by da Vinci. HTH, Emir
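A small sketch of how the suggested query could be passed as a request parameter. The host, collection name, and `my_field` are placeholders; the phrase is wrapped in straight double quotes, and the code only builds the URL without sending it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def complexphrase_url(base_url, collection, field, phrase):
    """Build a /select URL that runs `phrase` through the complexphrase parser."""
    # Local params select the parser and the default field for the phrase.
    q = '{{!complexphrase df={}}}"{}"'.format(field, phrase)
    return "{}/solr/{}/select?{}".format(
        base_url.rstrip("/"), collection, urlencode({"q": q})
    )

# Hypothetical host, collection, and field name.
url = complexphrase_url("http://localhost:8983", "mycollection",
                        "my_field", "Leonardo -(da Vinci)")
print(url)

# Decode the query string again to confirm the q parameter survived encoding.
decoded_q = parse_qs(urlsplit(url).query)["q"][0]
print(decoded_q)
```

URL-encoding matters here because the braces, quotes, and parentheses in the local-params syntax are easy to mangle when pasted into a browser or shell.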

Getting SOE after a clean import and restart on Solr 7.2.1

2018-02-13 Thread Mohsen Saboorian
After importing 3 million records from a DB into Solr 7.2.1 on CentOS 7 with OpenJDK 8, I restarted Solr and the core (mycore) cannot be instantiated. data/index is 36GB and the tlog folder contains a file named tlog.0001862, which is about 20MB. Here is the full log: INFO (main) [ ]

Re: docvalues set to true, and indexed is false and stored is set to false

2018-02-13 Thread Emir Arnautović
Whenever you send a doc for indexing, it is indexed completely, and the old document with the same id (if one exists) is just flagged as deleted and will be removed from the index when the segment in which it is stored is merged. In the case of large segments, that might be never. The safest option is to do full

RE: Solr search word NOT followed by another word

2018-02-13 Thread ivan
That looks great! Not sure how to install that into my version of Solr though (using 6.4.1) -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

RE: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread Alessandro Benedetti
Hi David, given the fact that you are actually building a new index from scratch, my shot in the dark didn't hit any target. When you say : "Once the import finishes we save the docker image in the AWS docker repository. We then build our cluster using that image as the base" Do you mean just

Re: Multiple context fields in suggester component

2018-02-13 Thread Alessandro Benedetti
Simple answer is No. Only one context field is supported out of the box. The query you provide as context filtering query ( suggest.cfq= ) is going to be parsed and a boolean query for the context field is created [1]. You will need some customizations if you are targeting that behavior. [1]
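A sketch of a suggester request that passes a context filter through `suggest.cfq`. The handler path `/suggest`, the dictionary name `mySuggester`, the host, the collection name, and the sample values are all assumptions; the code only builds the URL.

```python
from urllib.parse import urlencode

def suggest_url(base_url, collection, dictionary, text, context_query):
    """Build a suggester URL with a context filter query (suggest.cfq)."""
    params = [
        ("suggest", "true"),
        ("suggest.dictionary", dictionary),
        ("suggest.q", text),
        # Parsed by Solr into a query against the single configured context field.
        ("suggest.cfq", context_query),
    ]
    return "{}/solr/{}/suggest?{}".format(
        base_url.rstrip("/"), collection, urlencode(params)
    )

# Hypothetical values: a zip code used as the context, a prefix as the query.
url = suggest_url("http://localhost:8983", "mycollection",
                  "mySuggester", "den", "90210")
print(url)
```

Since the filter always targets the one configured context field, filtering on a second field would need the customization mentioned in the message.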

Re: facet.method=uif not working in solr cloud?

2018-02-13 Thread Alessandro Benedetti
*Update*: This has actually already been solved by Hoss. https://issues.apache.org/jira/browse/SOLR-11711 and this is the Pull Request: https://github.com/apache/lucene-solr/pull/279/files This should go live with 7.3 Cheers - --- Alessandro Benedetti Search Consultant, R

RE: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread Howe, David
Hi Alessandro, Thanks for responding. We rebuild the index every time starting from a fresh installation of Solr. Because we are running at AWS, we have automated our deployment so we start with the base docker image, configure Solr and then import our data every time the data changes (it

Re: docvalues set to true, and indexed is false and stored is set to false

2018-02-13 Thread mganeshs
Hi, Thanks for the quick response. I forgot to mention that after adding it, I re-indexed all the data with dynamic fields Field_one, Field_two etc. In that case, when adding a new field (docValues field) or removing an existing docValues field, will the whole document be re-indexed again, or

Re: facet.method=uif not working in solr cloud?

2018-02-13 Thread Alessandro Benedetti
+1 I believe it is a bug related to that patch in some way. facet.distrib.mco ( the naming is not very explicit) should activate the feature in the patch, which forces the mincount in the distributed requests to be set to 1. The normal behavior expected is that you pass to the distributed

Re: docvalues set to true, and indexed is false and stored is set to false

2018-02-13 Thread Emir Arnautović
Hi, Changing the schema will not do anything by itself. After the changes are applied (core reloaded, if the API was not used to update the schema), Solr will use the new schema to index new documents. What matters is what you had in the index before the schema updates. So if you had defined Field_one as string or you had it as

Re: Solr node is out of sync (looks Healthy)

2018-02-13 Thread Emir Arnautović
Hi Daniel, Back to your original question. What is the diff between doc number on replicas - a few docs or large number? My assumption is that you don’t have autocommit enabled and that you commit explicitly when indexing is done, and somehow on some replica(s) commit is processed before all

docvalues set to true, and indexed is false and stored is set to false

2018-02-13 Thread mganeshs
Hi, If I have set the following in the schema: What will be the impact of deleting a single field, the "Fields_one" field, or what's the impact of adding a new field "Fields_100"? Will the whole document be re-indexed again, or will only this field alone be deleted and added correspondingly? Idea

Re: solr spell check index dictionary build failed issue

2018-02-13 Thread Alessandro Benedetti
Shooting in the dark, it seems that two processes are trying to write to the same disk directory. Is this directory shared by different Solr cores or Solr instances? If you share the relevant configuration from solrconfig, we may be able to help.

Re: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread Alessandro Benedetti
I assume you re-index in full, right? My shot in the dark is that this increment is temporary. You re-index, so you effectively delete and add all documents (this means that even if the new field is just stored, you re-build the entire index for all the fields). Create new segments and the old docs

Re: "editorialMarkerFieldName"

2018-02-13 Thread Emir Arnautović
Hi Chris, Guessing here, but this feature might have been introduced in case you use (or have to use) an ‘elevated’ field in your schema for some other purpose. Maybe if the field name was “_elevated_”, then it would be redundant. Emir

Re: Solr node is out of sync (looks Healthy)

2018-02-13 Thread Daniel Carrasco
Hello, I answer inline ;) 2018-02-12 23:56 GMT+01:00 Emir Arnautović : > Hi Daniel, > Please see inline comments. > -- > Monitoring - Log Management - Alerting - Anomaly Detection > Solr & Elasticsearch Consulting Support Training - http://sematext.com/ > > > > >

Multiple context fields in suggester component

2018-02-13 Thread Renuka Srishti
Hello All, Is there any way to set multiple context fields in suggester component? Or is there any way to apply multiple filters with suggester component in solr? Thanks Renuka Srishti