Migrate from Solr 5.3.1 to 7.5.0

2019-02-12 Thread ramyogi
We are migrating our Solr version. We used 3 ZK hosts configured in Solr as the ZK connection string: zookeeper.solrtest.net:2181/test-config. Ensemble size: 1, Ensemble mode: ensemble, zookeeper.solrtest.net:2181 ok true, clientPort 2181, zk_server_state follower, zk_version 3.4.5

Re: Java object binding not working

2019-02-12 Thread Swapnil Katkar
Hi, Do you need any input from me to resolve this issue? Regards, Swapnil Katkar On Fri, Feb 8, 2019 at 10:30 AM Swapnil Katkar wrote: > Hi, > > It would be beneficial to me if you provide me at least some hint to > resolve this problem. Thanks in advance! > > Regards, > Swapnil Katkar > > >

Solr 7.5.0 (Migrate from 5.3.1 to 7.5.0)

2019-02-12 Thread ramyogi
We are migrating from Solr 5.3 to Solr 7.5. After doing so we see: 2/12/2019, 11:22:37 AM WARN false x:test_shard20_replica_n38 SolrCore [test_shard20_replica_n38] PERFORMANCE WARNING: Overlapping onDeckSearchers=4 2/12/2019, 11:22:37 AM WARN false x:test_shard20_replica_n38 SolrCore [test_shard20_replica_n38]

Re: Docker and Solr Indexing

2019-02-12 Thread solrnoobie
Oh ok, then that must not be the culprit. I got these logs from our application server but I'm not sure if this is useful: Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.http.ParseException: Invalid content type: at

Re: Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Joe Obernberger
Reverted back to 7.6.0 - same settings, but now I do not encounter the large CPU usage. -Joe On 2/12/2019 12:37 PM, Joe Obernberger wrote: Thank you Shawn.  Yes, I used the settings off of your site. I've restarted the cluster and the CPU usage is back up again. Looking at it now, it doesn't

Get details about server-side errors

2019-02-12 Thread Christopher Schultz
Hello, everyone. I'm trying to get some information about a (fairly) simple case when a user is searching using a wide-open query where they can type in anything they want, including field-names. Of course, it's possible that they will try to enter
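
Assuming a SolrJ client (the thread does not say which client is in use), a rough sketch of surfacing the server-side error code and message on the client side:

    import java.io.IOException;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrException;

    public class QueryErrorExample {
        // Runs a user-supplied query string and reports what Solr sent back on failure.
        public static void search(SolrClient client, String userInput) {
            try {
                QueryResponse rsp = client.query(new SolrQuery(userInput));
                System.out.println("hits: " + rsp.getResults().getNumFound());
            } catch (SolrException e) {
                // HTTP status code plus the message from the server, e.g. an undefined field
                System.err.println("Solr rejected the query (" + e.code() + "): " + e.getMessage());
            } catch (SolrServerException | IOException e) {
                System.err.println("Request failed: " + e.getMessage());
            }
        }
    }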

[SECURITY] CVE-2017-3164 SSRF issue in Apache Solr

2019-02-12 Thread Tomas Fernandez Lobbe
CVE-2017-3164 SSRF issue in Apache Solr Severity: High Vendor: The Apache Software Foundation Versions Affected: Apache Solr versions from 1.3 to 7.6.0 Description: The "shards" parameter does not have a corresponding whitelist mechanism, so it can request any URL. Mitigation: Upgrade to

Re: Solr relevancy score different on replicated nodes

2019-02-12 Thread Aman Tandon
Thanks Erick for your suggestions and time. On Tue, Feb 12, 2019, 22:32 Erick Erickson wrote: > You really only have four options: > 1> use exactstats. This won't guarantee precise matches, but they'll be > closer > 2> optimize (not particularly recommended, but if you're willing to do > it periodically it'll

Re: Createsnapshot null pointer exception

2019-02-12 Thread Erick Erickson
You're going to continually run into issues if you use the _cores_ api to add replicas to a _collection_. True, the collection API ADDREPLICA command uses the _cores_ API to add replicas, but someone else has already worked out all the finicky details. So do yourself a favor and use the
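
For illustration, a SolrJ sketch of the collections API ADDREPLICA call Erick recommends; the ZooKeeper address, collection and shard names here are placeholders, not taken from the thread:

    import java.util.Collections;
    import java.util.Optional;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    public class AddReplicaExample {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    Collections.singletonList("localhost:2181"), Optional.empty()).build()) {
                // Collections API ADDREPLICA: Solr works out the core-level details.
                CollectionAdminRequest.addReplicaToShard("test", "shard1").process(client);
            }
        }
    }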

Re: Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Joe Obernberger
Thank you Shawn.  Yes, I used the settings off of your site. I've restarted the cluster and the CPU usage is back up again. Looking at it now, it doesn't appear to be GC related. Full log from one of the nodes that is pegging 13 CPU cores: http://lovehorsepower.com/solr_gc.log.0.current Thank

Re: Document Score seen in debug section and in main results section don't match

2019-02-12 Thread Baloo
Thanks Erick, We will stick to Solr 7.2.1 which works fine with multiple boost queries. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: What's the deal with dataimporthandler overwriting indexes?

2019-02-12 Thread Elizabeth Haubert
I've run into this also; it is a key difference between a master-slave setup and a solrCloud setup. clean=true has always deleted the index on the first commit, but in older versions of Solr, the workaround was to disable replication until the full reindex had completed. This is a convenient

Re: Document Score seen in debug section and in main results section don't match

2019-02-12 Thread Erick Erickson
Sounds like: https://issues.apache.org/jira/browse/SOLR-13126 On Tue, Feb 12, 2019 at 12:12 AM Baloo wrote: > > I will try to provide more data about this issue, > > If you see attached query response , It shows > >> In 1st, 2nd document only query matched and boost queries did not match > >>

Re: mysterious NullPointerException while adding documents

2019-02-12 Thread Erick Erickson
bq. I disabled autocommit (both soft and hard), but used to work with a previous version of the schema. First, did you _change_ the schema without 1> deleting all the docs in the index 2> reindexing everything or better, indexing to a new collection and aliasing to it? If you changed the

Re: Createsnapshot null pointer exception

2019-02-12 Thread SOLR4189
OK, I understood my problem. Usually I create a collection with X shards and then add some Y cores. These Y cores I use as gateways or federators (my web application sends queries to a load balancer that is connected to the Y cores only). When I create the Y cores, I used this command

Re: Can I use solr cloud replica for reindexing

2019-02-12 Thread Erick Erickson
No. All replicas in a collection use the same schema. You can create a new _collection_, index to that, then use collections API CREATEALIAS command to point to the new collection with the old name. Best, Erick On Tue, Feb 12, 2019 at 1:11 AM Alexey Ponomarenko wrote: > > Hi, I have a question
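
A SolrJ sketch of the reindex-then-alias approach Erick describes; the alias and collection names are placeholders:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    public class SwapAliasExample {
        // After reindexing into "products_v2", point the existing alias
        // "products" at it so clients keep using the old name.
        public static void swapAlias(SolrClient client) throws Exception {
            CollectionAdminRequest.createAlias("products", "products_v2").process(client);
        }
    }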

Re: Solr relevancy score different on replicated nodes

2019-02-12 Thread Erick Erickson
You really only have four options: 1> use exactstats. This won't guarantee precise matches, but they'll be closer 2> optimize (not particularly recommended, but if you're willing to do it periodically it'll have the stats match until the next updates). 3> use TLOG/PULL replicas and confine the requests to

Re: unable to create new threads: out-of-memory issues

2019-02-12 Thread Erick Erickson
Absolutely increase the file limit before going down other avenues. I recommend 65K. This is because I've spent way more time than I want to think about finding out that this is the problem as it can pop out in unexpected ways, ways that are totally _not_ obvious. It's one of those things

Sort order, return the first 20 results, and the last 80 results

2019-02-12 Thread Michael Tracey
Hey all, I'm interested in returning 100 rows in a query, with a sort order on a tfloat field, but returning the first 20 results, then the last 80 results. I'd like to do this without two requests, to keep down requests per second. Is there any way to do this in one query with function queries

Re: Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Shawn Heisey
On 2/12/2019 7:35 AM, Joe Obernberger wrote: Yesterday, we upgraded our 40 node cluster from solr 7.6.0 to solr 7.7.0.  This morning, all the nodes are using 1200+% of CPU. It looks like it's in garbage collection.  We did reduce our HDFS cache size from 11G to 6G, but other than that, no

RE: What's the deal with dataimporthandler overwriting indexes?

2019-02-12 Thread Vadim Ivanov
Hi! If clean=true then the index will be replaced completely by the new import. That is how it is supposed to work. If you don't want to preemptively delete your index, set =false. And set =true instead of =true. Are you sure about optimize? Do you really need it? Usually it's very costly. So, I'd try:

Re: unable to create new threads: out-of-memory issues

2019-02-12 Thread Walter Underwood
Create one instance of HttpSolrClient and reuse it. It is thread-safe. It also keeps a connection pool, so reusing the same one will be faster. Do you really need atomic updates? Those are much slower because they have to read the document before updating. wunder Walter Underwood
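
A minimal SolrJ sketch of the single-client pattern described above; the URL and collection name are placeholders, not taken from the thread:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class SharedClientIndexer {
        // One client for the whole application: HttpSolrClient is thread-safe
        // and keeps its own connection pool, so reuse is both safe and faster.
        private static final SolrClient CLIENT =
            new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build();

        public static void index(SolrInputDocument doc) throws Exception {
            CLIENT.add(doc);   // same client for every document
        }

        public static void shutdown() throws Exception {
            CLIENT.close();    // close once, when the application stops
        }
    }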

RE: unable to create new threads: out-of-memory issues

2019-02-12 Thread Vadim Ivanov
Hi! I had the same issue and found that the actual problem was with the file limit (in spite of the error message). To increase the file limit on Linux, you can increase the limits by running the following command as root: sysctl -w vm.max_map_count=262144 To set this value permanently, update the

Re: Docker and Solr Indexing

2019-02-12 Thread Shawn Heisey
On 2/12/2019 6:56 AM, solrnoobie wrote: I know this is too late of a reply but I found this on our solr.log java.nio.file.NoSuchFileException: USUALLY, this is a harmless annoyance, not an indication of an actual problem. Some people have indicated that it causes problems when using the

Re: What's the deal with dataimporthandler overwriting indexes?

2019-02-12 Thread Emir Arnautović
Hi Joakim, This might not be what you expect but it is expected behaviour. When you do clean=true, DIH will first delete all records. That is how it works in both M/S and Cloud. The diff might be that you disabled replication or disabled auto commits in your old setup so it is not visible. You

RE: unable to create new threads: out-of-memory issues

2019-02-12 Thread Martin Frank Hansen (MHQ)
Hi Mikhail, Thanks for your help. I will try it. -Original Message- From: Mikhail Khludnev Sent: 12 February 2019 15:54 To: solr-user Subject: Re: unable to create new threads: out-of-memory issues 1. you can jstack to find it out. 2. It might create a thread, I don't know. 3.

Re: unable to create new threads: out-of-memory issues

2019-02-12 Thread Mikhail Khludnev
1. you can jstack to find it out. 2. It might create a thread, I don't know. 3. SolrClient is definitely a subject for heavy reuse. On Tue, Feb 12, 2019 at 5:16 PM Martin Frank Hansen (MHQ) wrote: > Hi Mikhail, > > I am using Solrj but think I might have found the problem. > > I am doing a

Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Joe Obernberger
Yesterday, we upgraded our 40 node cluster from Solr 7.6.0 to Solr 7.7.0. This morning, all the nodes are using 1200+% of CPU. It looks like it's in garbage collection. We did reduce our HDFS cache size from 11G to 6G, but other than that, no other parameters were changed. Top shows: top -

RE: unable to create new threads: out-of-memory issues

2019-02-12 Thread Martin Frank Hansen (MHQ)
Hi Mikhail, I am using SolrJ but think I might have found the problem. I am doing an atomicUpdate on existing documents, and found out that I create a new SolrClient for each document. I guess this is where all the threads are coming from. Is it correct that when creating a SolrClient, I also
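
A sketch of what a SolrJ atomic "set" update can look like when the client is created once and passed in rather than per document; the collection and field names are placeholders:

    import java.util.Collections;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class AtomicUpdateExample {
        // Atomic "set" update of one field on an existing document.
        // "mycollection" and "status_s" are placeholder names.
        public static void setStatus(SolrClient client, String docId, String status) throws Exception {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", docId);
            doc.addField("status_s", Collections.singletonMap("set", status));
            client.add("mycollection", doc);   // no new SolrClient per document
        }
    }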

Re: unable to create new threads: out-of-memory issues

2019-02-12 Thread Mikhail Khludnev
Hello, Martin. How do you index? Where did you get this error? Usually it occurs in custom code with many new Thread() calls and is usually healed with thread pooling. On Tue, Feb 12, 2019 at 3:25 PM Martin Frank Hansen (MHQ) wrote: > Hi, > > I am trying to create an index on a small Linux server

Re: Docker and Solr Indexing

2019-02-12 Thread solrnoobie
I know this is too late of a reply but I found this on our solr.log java.nio.file.NoSuchFileException: /opt/solr/server/solr/primaryCollectionPERF_shard1_replica9/data/index/segments_78 at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) at

unable to create new threads: out-of-memory issues

2019-02-12 Thread Martin Frank Hansen (MHQ)
Hi, I am trying to create an index on a small Linux server running Solr-7.5.0, but keep running into problems. When I try to index a file-folder of roughly 18 GB (18000 files) I get the following error from the server: java.lang.OutOfMemoryError: unable to create new native thread. From the

Re: Get recent documents from solr

2019-02-12 Thread shruti suri
Problem got resolved using MaxFieldValueUpdateProcessorFactory. I indexed all the update date fields into one field and used this processor factory on that field to get the latest date among all dates - Regards Shruti -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Increasing solr nodes

2019-02-12 Thread Hendrik Haddorp
You can use the MOVEREPLICA command: https://lucene.apache.org/solr/guide/7_6/collections-api.html Alternatively, you can also add another replica and then remove one of your old replicas. When you add a replica you can either specify the node it shall be placed on or let Solr pick a node for you.
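
A sketch of invoking MOVEREPLICA from SolrJ via a generic collections API request; the collection, replica and target node names are placeholders:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrRequest;
    import org.apache.solr.client.solrj.request.GenericSolrRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class MoveReplicaExample {
        // Calls the collections API MOVEREPLICA action with placeholder values.
        public static void move(SolrClient client) throws Exception {
            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("action", "MOVEREPLICA");
            params.set("collection", "test");
            params.set("replica", "core_node3");
            params.set("targetNode", "newhost:8983_solr");
            new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/collections", params).process(client);
        }
    }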

Re: Solr relevancy score different on replicated nodes

2019-02-12 Thread Aman Tandon
Hi Erick, Any suggestions on this? Regards, Aman On Fri, Feb 8, 2019, 17:07 Aman Tandon Hi Erick, > > I find this thread very relevant to the people who are facing the same > problem. > > In our case, we have a signals aggregation collection which is having > total of around 8 million records.

What's the deal with dataimporthandler overwriting indexes?

2019-02-12 Thread Joakim Hansson
Hi! We are currently upgrading from a Solr 6.2 master-slave setup to Solr 7.6 running SolrCloud. I don't know if I've missed something really trivial, but every time I start a full import (dataimport?command=full-import=true=true) the old index gets overwritten by the new import. In 6.2 this wasn't

Can I use solr cloud replica for reindexing

2019-02-12 Thread Alexey Ponomarenko
Hi, I have a question - https://stackoverflow.com/questions/54593171/can-i-use-solr-cloud-replica-for-reindexing . Can you help me?

Re: mysterious NullPointerException while adding documents

2019-02-12 Thread MUNENDRA S.N
Are you trying to set some field to null in the request? Also, is that particular field numeric, with docValues enabled and stored set to false? Sharing more details would help here, specifically the update request and schema for those fields. Regards, Munendra S N On Tue, Feb 12, 2019 at 2:24 PM

mysterious NullPointerException while adding documents

2019-02-12 Thread Danilo Tomasoni
Hello all, I get this error while uploading my documents with the 'set' modifier in JSON format. My Solr version is 7.3.1. I disabled autocommit (both soft and hard), but it used to work with a previous version of the schema. Does anyone have a clue what's going on here? I can't reproduce the

[lucene > nori ] special characters issue

2019-02-12 Thread 유정인
Hi, I'm using the "nori" analyzer. I'm not sure whether this is an error or intentional behavior: all special characters are filtered out. Special characters stored in the dictionary are also filtered. How do I output special characters?

Re: Ignore accent in a request

2019-02-12 Thread Ere Maijala
I'm not brave enough to try char filter with such a large table, so I can't really comment on that. I gave up with char filter after running into some trouble handling cyrillic letters. At least ICUFoldingFilter is really simple to use, and with more recent Solr versions you can also use it with

Increasing solr nodes

2019-02-12 Thread neerajbhatt
Hi, we have a Solr cluster of 3 machines. A collection has three shards and 2 replicas, so a total of 9. Right now each machine has one shard leader and 2 replicas. Because of index size we need to increase the cluster to 9. What is the best possible way to move a shard leader or replica to a new

Re: Document Score seen in debug section and in main results section don't match

2019-02-12 Thread Baloo
I will try to provide more data about this issue. If you see the attached query response, it shows: >> In the 1st and 2nd documents only the query matched and the boost queries did not match >> In the 3rd document the boost query matched and the total score of the 3rd document >> became 400.04, which is higher than the 1st and 2nd