Re: Load suggest dictionary from non-Zookeeper file?

2019-05-08 Thread Shawn Heisey
On 5/8/2019 2:34 PM, Mikhail Khludnev wrote: It reminds me https://lucene.apache.org/solr/guide/7_6/blob-store-api.html but I don't think it's already integrated with suggester. I'm having one of those days where I can't seem to recall things easily. With the blob store, the blobs are in

Re: Load suggest dictionary from non-Zookeeper file?

2019-05-08 Thread Shawn Heisey
On 5/8/2019 1:59 PM, Walter Underwood wrote: Our suggest dictionary is too big for Zookeeper. I’m trying to load it from an absolute path, but the Solr 6.6.1 insists on interpreting that as a Zookeeper path. Any way to disable that? I wouldn't be surprised to learn it's not possible to get

Re: Error when merging segments ("terms out of order")

2019-05-08 Thread Shawn Heisey
On 5/8/2019 10:47 AM, Alméras Yannick wrote: The problem of segments merging seems to be solved when I replace Java 11 32bit with Java 8 32bit on my prod Ubuntu server... (On my dev archlinux computer, no problem with Java 11 64bit...). It is strongly recommended to run a 64-bit version of

Re: Modify partial configsets using API

2019-05-08 Thread Shawn Heisey
On 5/8/2019 10:50 AM, Mike Drob wrote: Solr Experts, Is there an existing API to modify just part of my configset, for example synonyms or stopwords? I see that there is the schema API, but that is pretty specific in scope. Not sure if I should be looking at configset API to upload a zip with

Re: basic question about updating a docValue

2019-05-07 Thread Shawn Heisey
On 5/7/2019 7:35 AM, Shawn Heisey wrote: The field must be 'indexed="false"' as well for in-place updates to work.  If you have indexed set to true, I don't think that's going to work.  Here's the relevant documentation section: My answer was not meant to contradict the one you g

Re: basic question about updating a docValue

2019-05-07 Thread Shawn Heisey
On 5/6/2019 11:03 PM, Jerry Lin wrote: I'm new to Solr and am using Solr 8, and the Java API client. I have a score that I would like to rank my documents by, and I do not need to retrieve the values. My understanding is that I should set indexed="true", stored="false", and DocValues="true".

Re: Update documents cause multivalue fields unexpected behaviour

2019-05-07 Thread Shawn Heisey
On 5/7/2019 5:45 AM, Jie Luo wrote: For the fields that are set as stored true, query works fine, but for fields that are set as stored false, the query does not work after the documents are updated. SolrInputDocument solrInputDocument = new SolrInputDocument();

Re: Error when merging segments ("terms out of order")

2019-05-07 Thread Shawn Heisey
On 5/7/2019 5:28 AM, Alméras Yannick wrote: I don't understand a problem on my ubuntu 18.04 solr server (version 7.6.0)... When merge of segments is called, there is an error and then, the index is not writable. The logs of a failed segments merging are at the end of this message (here, I

Re: Zookeeper solr config files

2019-05-06 Thread Shawn Heisey
On 5/6/2019 9:21 AM, Kojo wrote: This is a Zookeeper question, but I wonder if you can help me. Is it possible to directly version SolrCloud config files on Zookeeper using Git or any other versioning system? Or do I really need to use the Zookeeper CLI? When I said versioning directly on Zookeeper, I

Re: Solr URI Too Long

2019-05-05 Thread Shawn Heisey
On 5/5/2019 3:40 PM, Furkan KAMACI wrote: I got a URI Too Long error and try to fix it. I'm aware of this conversation: http://lucene.472066.n3.nabble.com/URI-is-too-long-td4254270.html I've tried: Used POST instead of GET at SolrJ Can we see the actual code? I can probably verify whether

Re: [collection create & delete] collection It is not created after several hundred times when it is repeatedly deleted and created. Resolved after restarting the service.

2019-05-03 Thread Shawn Heisey
On 4/30/2019 1:38 AM, 유정인 wrote: 2019-04-27 21:50:32.043 ERROR (OverseerThreadFactory-1184-thread-4- processing-n:211.60.221.94:9080_) [ ] o.a.s.c.a.c.OverseerCollectionMessageHandler [processResponse:880] Error from shard: http://x.x.x.x:8080 org.apache.solr.client.solrj.SolrServerException:

Re: Reverse-engineering existing installation

2019-05-03 Thread Shawn Heisey
On 5/3/2019 1:44 PM, Erick Erickson wrote: Then git will let you check out any previous branch. 4.2 is from before we switched to Git, so I’m not sure you can go that far back, but 4x is probably close enough for comparing configs. Git has all of Lucene's history, and most of Solr's history,

Re: Solr long q values

2019-05-03 Thread Shawn Heisey
On 5/3/2019 1:37 PM, Erick Erickson wrote: We already do warnings for ulimits, so memory seems reasonable. Along the same vein, does starting with 512M make sense either? Feel free to raise a JIRA, but I won’t have any time to work on it… Done.

Re: Solr long q values

2019-05-03 Thread Shawn Heisey
On 5/3/2019 2:32 AM, solrnoobie wrote: So whenever we have long q values (from a sentence to a small paragraph), we encounter some heap problems (OOM) and I guess this is normal? So my question would be is how should we handle this type of problem? Of course we could always limit the size of

Re: Accessing Solr collections at different ports - Need help

2019-05-03 Thread Shawn Heisey
On 5/3/2019 12:52 AM, Salmaan Rashid Syed wrote: I say that the nodes are limited to 4 because when I launch Solr in cloud mode, the first prompt that I get is to choose number of nodes [1-4]. When I tried to enter 7, it says that they are more than 4 and choose a smaller number. That's the

Re: Accessing Solr collections at different ports

2019-05-03 Thread Shawn Heisey
On 5/2/2019 11:47 PM, Salmaan Rashid Syed wrote: I am using Solr 7.6 in cloud mode with external zookeeper installed at ports 2181, 2182, 2183. Currently we have only one server allocated for Solr. We are planning to move to multiple servers for better sharing, replication etc in near future.

Re: Solr 7 Nodes Suck in "Gone" State

2019-04-29 Thread Shawn Heisey
On 4/29/2019 10:55 AM, Marko Babic wrote: Thanks Shawn. Yes, all Solr nodes know about all three ZK servers (i.e., the zk host string is of the form zk_a_ip:2181,zk_b_ip:2181,zk_c_ip:2181). Sorry for the dense description of things: I erred on the side of oversharing because I didn't want to

Re: [collection create & delete] collection It is not created after several hundred times when it is repeatedly deleted and created. Resolved after restarting the service.

2019-04-29 Thread Shawn Heisey
On 4/29/2019 6:36 PM, 유정인 wrote: I am using solr 7.7.1. I would like to ask about index issues. Use the solr rest api to do the indexing process. There is a problem here. My indexing process creates a collection every time I perform a batch index. When the index is completed, it alias, and

Re: SOLR / Lucene which openJDK to use

2019-04-29 Thread Shawn Heisey
On 4/29/2019 6:46 AM, Bernd Fehling wrote: while going to change my JAVA from Oracle to openJDK the big question is which distribution to take? Currently we use Oracle JDK Java SE 8 because of LTS. Next would be JDK Java SE 11 again because of LTS but now we have to change to openJDK. Any

Re: COLLECTION CREATE and CLUSTERSTATUS changes in SOLR 7.5.0

2019-04-28 Thread Shawn Heisey
On 4/28/2019 6:39 AM, ramyogi wrote: Thanks Eric, After we create a collection and copy the index from one place new place, we are doing UNLOAD core and CREATE core as below, is it wrong and we have alternative to do that ? Do not use CoreAdmin when running SolrCloud. At all. It will cause

Re: Solr 7 Nodes Suck in "Gone" State

2019-04-26 Thread Shawn Heisey
On 4/26/2019 9:23 AM, Marko Babic wrote: Apologies for bumping my own post but I'm just wondering if it'd be more appropriate for me to cut a ticket at this point rather than ask the mailing list. No, the mailing list is the correct location. If the discussion determines that there is a

Re: Solr Cloud configuration

2019-04-26 Thread Shawn Heisey
On 4/26/2019 6:14 AM, Sadiki Latty wrote: What you're saying makes sense but is it achievable without downtime? i.e: Is it achievable to change the replication factor to 2 as you suggest, and Solr puts the sharded documents back together then replicate? Just changing the replicationFactor

Re: Solr Cloud configuration

2019-04-25 Thread Shawn Heisey
On 4/25/2019 2:44 PM, Sadiki Latty wrote: - replica 1 If I need to upgrade Solr, the recommended method is to update one at a time. However, when I bring down one Solr instance I noticed that queries no longer work and I get the error "no servers hosting shard" from the node that is

Re: Zk Status Error

2019-04-23 Thread Shawn Heisey
On 4/23/2019 12:14 PM, Sadiki Latty wrote: Here are the 2 errors in the Solr Logging section RequestHandlerBase  java.lang.ArrayIndexOutOfBoundsException: 1 “java.lang.ArrayIndexOutOfBoundsException: 1     at

Re: Determing Solr heap requirments and analyzing memory usage

2019-04-23 Thread Shawn Heisey
On 4/23/2019 11:48 AM, Brian Ecker wrote: I see. The other files I meant to attach were the GC log ( https://pastebin.com/raw/qeuQwsyd), the heap histogram ( https://pastebin.com/raw/aapKTKTU), and the screenshot from top ( http://oi64.tinypic.com/21r0bk.jpg). I have no idea what to do with

Re: Determing Solr heap requirments and analyzing memory usage

2019-04-23 Thread Shawn Heisey
On 4/23/2019 6:34 AM, Brian Ecker wrote: What I’m trying to determine is (1) How much heap does this setup need before it stabilizes and stops crashing with OOM errors, (2) can this requirement somehow be reduced so that we can use less memory, and (3) from the heap histogram, what is actually

Re: Different Parsed query for solr cloud and master slave with same solr version

2019-04-23 Thread Shawn Heisey
On 4/23/2019 2:04 AM, Anant Bhargatiya wrote: We are migrating from solr 5.5 master slave to solr 8.0 cloud deployment. for exactly same index and config, we are getting different results. We'll need to see the configs you are working with as well as the raw and parsed queries from both.

Re: How to Add replicas in 7.6, similar to 5.4

2019-04-22 Thread Shawn Heisey
On 4/22/2019 7:15 AM, Raveendra Yerraguntla wrote: Should the collection APIs need to be used to create a replicas? The answer to that is an emphatic yes. ALL changes to SolrCloud collections should be handled through the Collections API. If you are creating replicas any other way, such as
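
A minimal sketch of what a replica-creating Collections API request could look like. ADDREPLICA is the Collections API action for this; the host, collection name, and shard name below are hypothetical placeholders, not values from the thread.

```python
from urllib.parse import urlencode


def add_replica_url(base_url, collection, shard):
    """Build a Collections API ADDREPLICA request URL."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical node address, collection, and shard names.
print(add_replica_url("http://localhost:8983/solr", "mycollection", "shard1"))
```

The URL is only built here, not sent. Sending it with any HTTP client asks SolrCloud to place and register the new replica itself, which is the point of going through the Collections API.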

Re: Replica becomes leader when shard was taking a time to update document - Solr 6.1.0

2019-04-22 Thread Shawn Heisey
On 4/22/2019 3:19 AM, vishal patel wrote: -- 228634803 maxDoc of one shard [we have 26 collection in production and 2 shards 2 replicas] 228 million is quite a lot of documents. Can you gather and share the screenshot described on the following wiki page? There seem to be two Solr

Re: solr 7.x sql query returns null

2019-04-18 Thread Shawn Heisey
On 4/18/2019 1:47 AM, David Barnett wrote: I have a large solr 7.3 collection 400m + documents. I’m trying to use the Solr JDBC driver to query the data but I get a java.io.IOException: Failed to execute sqlQuery 'select id from document limit 10' against JDBC connection 'jdbc:calcitesolr:'.

Re: Replica becomes leader when shard was taking a time to update document - Solr 6.1.0

2019-04-18 Thread Shawn Heisey
On 4/18/2019 1:00 AM, vishal patel wrote: Thanks for your reply. You are right. I checked GC log and use of GC Viewer I noticed that pause time was 111.4546597 secs. 2019-04-08T13:52:09.939+0100: 796800.430: [GC (Allocation Failure) 796800.431: [ParNew Desired survivor size 2415919104

Re: Solr8.0.0 Time Zone Issue

2019-04-18 Thread Shawn Heisey
On 4/18/2019 1:50 AM, Anuj Bhargava wrote: In mySql, date field *date_upload* shows entry as 2019-04-17 However, after Solr Indexing *date_upload* is being shown as 2019-04-16T18:30:00Z I did change in solr.in.sh, SOLR_TIMEZONE="UTC" to SOLR_TIMEZONE="IST" and did a full-import again. The
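
The two values quoted above are consistent with each other if the MySQL date is read as midnight in IST (UTC+5:30) and then rendered in UTC. A small worked sketch of that arithmetic, assuming a fixed +05:30 offset (the helper name is illustrative only):

```python
from datetime import datetime, timedelta, timezone

# India Standard Time as a fixed offset (no daylight saving).
IST = timezone(timedelta(hours=5, minutes=30))


def to_solr_utc(local_dt):
    """Render a timezone-aware datetime in the UTC 'Z' form Solr displays."""
    return local_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# The date 2019-04-17 interpreted as midnight IST.
midnight_ist = datetime(2019, 4, 17, tzinfo=IST)
print(to_solr_utc(midnight_ist))  # 2019-04-16T18:30:00Z
```

Both strings name the same instant; only the time zone used for display differs.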

Re: Optimizing fq query performance

2019-04-18 Thread Shawn Heisey
On 4/17/2019 11:49 PM, John Davis wrote: I did a few tests with our instance solr-7.4.0 and field:* vs field:[* TO *] doesn't seem materially different compared to has_field:1. If no one knows why Lucene optimizes one but not another, it's not clear whether it even optimizes one to be sure.

Re: Solr8.0.0 date search issue

2019-04-18 Thread Shawn Heisey
On 4/17/2019 8:45 PM, Anuj Bhargava wrote: I have an issue while searching on the Date field date_upload My Schema file has the following entry for DATE Field ** You haven't shown us the definition for pdate. If it is what the Solr examples have, then it is a DatePointField. My

Re: Optimizing fq query performance

2019-04-17 Thread Shawn Heisey
On 4/17/2019 1:21 PM, John Davis wrote: If what you describe is the case for range query [* TO *], why would lucene not optimize field:* similar way? I don't know. Low level lucene operation is a mystery to me. I have seen first-hand that the range query is MUCH faster than the wildcard

Re: Optimizing fq query performance

2019-04-17 Thread Shawn Heisey
On 4/17/2019 10:51 AM, John Davis wrote: Can you clarify why field:[* TO *] is lot more efficient than field:* It's a range query. For every document, Lucene just has to answer two questions -- is the value more than any possible value and is the value less than any possible value. The
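
As a sketch of how the range form of the filter might be sent, the following builds a select URL with `fq=field:[* TO *]`. The host, collection, and field names are hypothetical, and the code only constructs the URL.

```python
from urllib.parse import urlencode


def has_value_filter(field):
    """Open-ended range filter matching documents that have any value in the field."""
    return f"{field}:[* TO *]"


def select_url(base_url, collection, q, fq):
    """Build a /select URL with one filter query and no returned rows."""
    params = [("q", q), ("fq", fq), ("rows", 0)]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical host, collection, and field names.
url = select_url("http://localhost:8983/solr", "mycollection", "*:*",
                 has_value_filter("myfield"))
print(url)
```

Swapping `has_value_filter("myfield")` for the string `"myfield:*"` gives the wildcard form, which makes it easy to time the two variants against the same index.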

Re: Replica becomes leader when shard was taking a time to update document - Solr 6.1.0

2019-04-17 Thread Shawn Heisey
On 4/17/2019 6:25 AM, vishal patel wrote: Why did shard1 take a 1.8 minutes time for update? and if it took time for update then why did replica1 try to become leader? Is it required to update any timeout? There's no information here that can tell us why the update took so long. My best

Re: Upgrading Solr 6.3.0 to 7.5.0 without having to re-index

2019-04-17 Thread Shawn Heisey
On 4/17/2019 3:52 AM, Ritesh Kumar wrote: Field type in old configuration - string (solr.StrField) indexed and stored set to true. Field type in new configuration - solr.SortableTextField (docValues enabled) On your schema, you have changed the field class -- from StrField to

Re: Highlighting

2019-04-15 Thread Shawn Heisey
On 4/15/2019 11:36 AM, Mike Phillips wrote: I don't understand why highlighting does not return anything but the document id. I created a core imported all my data, everything seems like it should be working. From reading the documentation I expect it to show me highlight information for

Re: Optimal RAM to size index ration

2019-04-15 Thread Shawn Heisey
On 4/15/2019 7:25 AM, SOLR4189 wrote: I have a collection with many shards. Each shard is in separate SOLR node (VM) has 40Gb index size, 4 CPU and SSD. When I run performance checking with 50GB RAM (10Gb for JVM and 40Gb for index) per node and 25GB RAM (10Gb for JVM and 15Gb for index), I get

Re: Optimizing fq query performance

2019-04-14 Thread Shawn Heisey
On 4/13/2019 12:58 PM, John Davis wrote: We noticed a sizable performance degradation when we add certain fq filters to the query even though the result set does not change between the two queries. I would've expected solr to optimize internally by picking the most constrained fq filter first,

Re: Shard and replica went down in Solr 6.1.0

2019-04-14 Thread Shawn Heisey
On 4/13/2019 9:29 PM, vishal patel wrote: 2> In production, lots of documents come for indexing within a second. If I set the hard commit interval to 60 seconds, then searchers are opened less often when the hard commit executes. Is that OK for performance? The autoCommit configuration should have

Re: How to prevent solr from deleting cores when getting an empty config from zookeeper

2019-04-11 Thread Shawn Heisey
On 4/11/2019 6:44 PM, Koen De Groote wrote: I gathered a solr log from 7.6.0 at TRACE level. Then I replicated the experiment with 6.6.5 and with that version, the directories were not deleted. Log also included. The audit log is from solr7. The deletes start at 01:51:48, which translates to

Re: high cpu threads (solr 7.5)

2019-04-11 Thread Shawn Heisey
On 4/11/2019 1:03 PM, Hari Nakka wrote: I mean the light weight processes (lwp) which were taking high cpu. I pulled the actual threads taking high cpu. full thread dump: *tdump.out* linux lwps: *high-cpu.out* top high cpu lwps mapped to thread nid: *high-cpu-dump.out (included threads taking

Re: How to prevent solr from deleting cores when getting an empty config from zookeeper

2019-04-11 Thread Shawn Heisey
On 4/11/2019 2:40 PM, Koen De Groote wrote: That being explained, am I right in understanding that currently there is no way of configuring Solr so that it won't delete the folders, in this event? In my opinion, Solr should never delete cores unless it has been explicitly *ASKED* to do so

Re: Solr New version 8.1

2019-04-11 Thread Shawn Heisey
On 4/11/2019 3:30 AM, vishal patel wrote: Any one knows about tentative date of stable SOLR 8.1 release? There are never any scheduled release dates. When one of the committers decides it's time for a new release and volunteers to be the release manager, then we have a release. It

Re: high cpu threads (solr 7.5)

2019-04-11 Thread Shawn Heisey
On 4/11/2019 2:21 AM, Hari Nakka wrote: Hi Erick, We upgraded JDK to 11. No improvement. Still seeing high cpu utilization randomly. Attached the full threaddump (tdump.out)  and lwp utilization (high-cpu.out) there were more than 30 threads (high-cpu-dump.out)taking high cpu. these are

Re: How to prevent solr from deleting cores when getting an empty config from zookeeper

2019-04-11 Thread Shawn Heisey
On 4/11/2019 3:17 AM, Koen De Groote wrote: The basic steps are: set up zookeeper, set up solr root, set up solr. Create dummy collection with example data. Stop the containers. Delete the zookeeper 'version-2' folder. Recreate zookeeper container. Redo the mkroot, recreate solr container. At

Re: Solr 8.0.0 - CPU usage 100% when indexed documents

2019-04-10 Thread Shawn Heisey
On 4/9/2019 10:53 PM, vishal patel wrote: Still my CPU usage went high and my CPU has 4 core and no other application running in my machine. I was asking how many CPUs went to 100 percent, not how many CPUs you have. And I also asked how long CPU usage remains at 100 percent after indexing

Re: Which fieldType to use for JSON Array in Solr 6.5.0?

2019-04-09 Thread Shawn Heisey
On 4/9/2019 2:04 PM, Abhijit Pawar wrote: Hello Guys, I am trying to index a JSON array in one of my collections in mongoDB in Solr 6.5.0 however it is not getting indexed. I am using a DataImportHandler for this. *Here's how the data looks in mongoDB:* { "idStr" :

Re: Solr Cache clear

2019-04-09 Thread Shawn Heisey
On 4/9/2019 12:38 PM, Lewin Joy (TMNA) wrote: I just tried to go to the location you have specified. I could not see a "CACHE" . I can see the "Statistics" section. I am using Solr 7.2 on solrcloud mode. If you are trying to select a *collection* from a dropdown, you will not see this. It

Re: Solr Cache clear

2019-04-09 Thread Shawn Heisey
On 4/9/2019 11:51 AM, Lewin Joy (TMNA) wrote: Hmm. I only tried reloading the collection as a whole. Not the core reload. Where do I see the cache sizes after reload? If you do not know how to see the cache sizes, then what information are you looking at which has led you to the conclusion

Re: Interesting Grouping/Facet issue

2019-04-09 Thread Shawn Heisey
On 4/9/2019 7:03 AM, Erie Data Systems wrote: Solr 8.0.0, I have a HASHTAG string field I am trying to facet on to get the most popular hashtags (top 100) across many sources. (SITE field is string) /select?facet.field=hashtag=on=0=%2Bhashtag:*%20%2BDT:[" . date('Y-m-d') . "T00:00:00Z+TO+" .

Re: Solr 8.0.0 - CPU usage 100% when indexed documents

2019-04-09 Thread Shawn Heisey
On 4/8/2019 11:00 PM, vishal patel wrote: Sorry my mistake there is no class of that. I have add the data using below code. CloudSolrServer cloudServer = new CloudSolrServer(zkHost); cloudServer.setDefaultCollection("actionscomments"); cloudServer.setParallelUpdates(true); List docs = new

Re: Sql entity processor sortedmapbackedcache out of memory issue

2019-04-09 Thread Shawn Heisey
On 4/8/2019 11:47 PM, Srinivas Kashyap wrote: I'm using DIH to index the data and the structure of the DIH is like below for solr core: 16 child entities During indexing, since the number of requests being made to database was high(to process one document 17 queries) and was utilizing most

Re: Solr Cache clear

2019-04-08 Thread Shawn Heisey
On 4/8/2019 2:14 PM, Lewin Joy (TMNA) wrote: How do I clear the solr caches without restarting Solr cluster? Is there a way? I tried reloading the collection. But, it did not help. When I reload a core on a test setup (solr 7.4.0), I see cache sizes reset. What evidence are you seeing that

Re: SOLR Text Field

2019-04-08 Thread Shawn Heisey
On 4/8/2019 10:27 AM, Dave Beckstrom wrote: SOLR really should ship with a sample text field defined even if commented out and only for example purposes only. That would have been most helpful. Even a FAQ somewhere would have been helpful. There are two example configs in the latest version

Re: Moving index from stand-alone Solr 6.6.0 to 3 node Solr Cloud 6.6.0 with Zookeeper

2019-04-08 Thread Shawn Heisey
On 4/8/2019 10:06 AM, Shawn Heisey wrote:
* Make sure you have a copy of the source index directory.
* Do not copy the tlog directory from the source.
* Create the collection in the target cloud.
* Shut down the target cloud completely.
* Delete all the index directories in the cloud.
* Copy

Re: Moving index from stand-alone Solr 6.6.0 to 3 node Solr Cloud 6.6.0 with Zookeeper

2019-04-08 Thread Shawn Heisey
On 4/8/2019 8:59 AM, kevinc wrote: I have reindexed to a single Solr 6.6.0 index and spun up a new 3 node Solr cluster with 1 shard and replication factor of 3. I want to copy over the index and have it replicate to the rest of the cluster. I have taken a copy of the data directory from the

Re: Solr 8.0.0 - CPU usage 100% when indexed documents

2019-04-08 Thread Shawn Heisey
On 4/8/2019 7:22 AM, vishal patel wrote: I have created two solr shards with 3 zoo keeper. First do upconfig in zoo keeper then start the both solr with different port then create a "actionscomments" collection using API call. When I indexed one document in actionscomments, my CPU utilization

Re: I it possible to configure solr to show time stamps without the 'Z'- character in the end

2019-04-08 Thread Shawn Heisey
On 4/8/2019 4:38 AM, Miettinen Jaana (STAT) wrote: I have a problem in solr: I should add several (old) time stamps into my solr documents, but all of them are in local time (UTC+2 or UTC+3 depending on day-light-saving situation). By default solr expects all time stamps to be in UTC-time

Re: SOLR Text Field

2019-04-06 Thread Shawn Heisey
On 4/6/2019 6:59 AM, Dave Beckstrom wrote: I'm really hating SOLR. All I want is to define a text field that data can be indexed into and which is searchable. Should be super simple. But I run into issue after issue. I'm running SOLR 7.3 because it's compatible with the version of NUTCH I'm

Re: SolrCloud with separate JAVA instances

2019-04-03 Thread Shawn Heisey
On 4/3/2019 8:16 AM, Bernd Fehling wrote: If I now use the Admin GUI at port 8983 and select "Cloud"->"Graph" I see both collections. Also with Admin GUI at port 7574. And I can select both collection in "Collection Selection" dropdown box. Why and is this how it should be? I thought

Re: Solr 7.5 - Indexing Failing due to "IndexWriter is Closed"

2019-04-01 Thread Shawn Heisey
On 4/1/2019 5:40 PM, Aroop Ganguly wrote: Thanks Shawn, for the initial response. Digging into a bit, I was wondering if we’d care to read the inner most stack. From the inner most stack it seems to be telling us something about what triggered it? Of course, the system could have been overloaded

Re: Solr 7.5 - Indexing Failing due to "IndexWriter is Closed"

2019-04-01 Thread Shawn Heisey
On 4/1/2019 4:44 PM, Aroop Ganguly wrote: I am facing this issue again. The stack mentions Heap space issue. Are the document sizes too big? Not sure what I should be doing here; As on the solr admin ui I do not see jvm being anywhere close to being full. Any advice on this is greatly

Re: AW: Solr 8.0.0 + IndexUpgrader

2019-04-01 Thread Shawn Heisey
On 4/1/2019 9:47 AM, Herbert Hackelsberger wrote: So, am I correct: - When using the IndexUpgrader, it will make the Index usable in the actual version, without all new features. - Using the Index Upgrader in the future again on the next major version will again result in this error situation.

Re: Solr 8.0.0 + IndexUpgrader

2019-04-01 Thread Shawn Heisey
On 4/1/2019 9:19 AM, Herbert Hackelsberger wrote: I tried to upgrade my test index from Solr 7.7.1 to Solr 8.0.0. The file segments_4h7 already contains the string Lucene70. I upgraded before with this command: java -cp lucene-core-7.7.1.jar;lucene-backward-codecs-7.7.1.jar

Re: Error on text field

2019-03-26 Thread Shawn Heisey
On 3/26/2019 10:56 AM, Dave Beckstrom wrote: I'm using Nutch to crawl and index some content. It failed on a SOLR field defined as a text field when it was trying to insert the following value for the field: What precisely does "failed" mean? Can you share the complete error? It will likely

Re: Help with slow retrieving data

2019-03-24 Thread Shawn Heisey
On 3/24/2019 12:11 PM, Wendy2 wrote: Thank you very much for your response! Here is a screen shot. Is the CPU an issue? You said that your index is 6GB, but the process listing is saying that you have more than 30GB of index data being managed by Solr. There's a discrepancy somewhere.

Re: Help with slow retrieving data

2019-03-24 Thread Shawn Heisey
On 3/24/2019 7:16 AM, Wendy2 wrote: Hi Solr users: I use Solr 7.3.1 with 150,000 documents, about 6GB in total. When I try to retrieve 2 ids (4 letter code, indexed and stored), it took 17s to retrieve 1.14M size data. I tried to increase RAM and cache, but Can you get the screenshot

Re: Java 9 & solr 7.7.0

2019-03-23 Thread Shawn Heisey
On 3/23/2019 8:12 AM, Jay Potharaju wrote: Can I use java 9 with 7.7.0. I am planning to test if fixes issue with high cpu that I am running into. https://bugs.openjdk.java.net/browse/JDK-8129861 Was solr 7.7 tested with java 9? The info for the 7.0.0 release said it was qualified with Java

Re: Solr query high response time

2019-03-22 Thread Shawn Heisey
On 3/22/2019 7:52 AM, Rajdeep Sahoo wrote: My solr query is sometimes taking more than 60 sec to return the response. Is there any way I can check why it is taking so much time? Please let me know if there is any way to analyse this issue (high response time). Thanks With the information

Re: [CAUTION] Re: Use of ShingleFilter causing very large BooleanQuery structures in Solr 7.1

2019-03-22 Thread Shawn Heisey
On 3/22/2019 2:02 AM, Hubert-Price, Neil wrote: One other question Is there a system level configuration that can change the default for the sow= parameter? Can it be flipped to have the default set to true? Any parameter can be put into the query handler definition. In defaults,

Re: Upgrading tika

2019-03-20 Thread Shawn Heisey
On 3/20/2019 8:24 AM, Tannen, Lev (USAEO) [Contractor] wrote: I still need your advice. The program I have to fix uses class AutoDetectParser along with Solrj for parsing PDF files before sending the result to the solr server. To do this it linked two tika jar files taken from the solr

Re: Incorrect Guava version in maven repository

2019-03-19 Thread Shawn Heisey
On 3/19/2019 6:17 PM, Amber Liu wrote: When I try to upgrade Guava that SOLR depends on, I notice the Guava version listed in maven repository for SOLR is 14.0.1 ( https://mvnrepository.com/artifact/org.apache.solr/solr-core/8.0.0). I also noticed that there is a Jira issue resolved in SOLR

Re: is df needed for SolrCloud replication?

2019-03-19 Thread Shawn Heisey
On 3/19/2019 4:48 PM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote: I recently noticed that my solr.log files have been getting the following error message: o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: no field name specified in query and no default specified via 'df' param The

Re: Need information on EofExceptions in solr 4.8.1

2019-03-19 Thread Shawn Heisey
On 3/19/2019 10:39 AM, Vijay Rawlani wrote: We are using solr 4.8.1 in our project. We are observing following EofExceptions in solr. It would be helpful for us to know in what situations we might land up with this. Can we get rid of this with any solr configuration or is there any way forward

Re: Upgrading tika

2019-03-19 Thread Shawn Heisey
On 3/19/2019 9:03 AM, levtannen wrote: Could anybody suggest me what files do I need to use the latest version of Tika and where to find them? This mailing list is solr-user. Tika is an entirely separate project from Solr within the Apache Foundation. To get help with Tika, you'll need to

Re: SolrJ - CoreAdminRequest / PingRequest and HttpSolrClient baseUrl

2019-03-18 Thread Shawn Heisey
On 3/18/2019 2:04 PM, Markus Schuch wrote: * CoreAdminRequest - is only working when no particular core is given * PingRequest - is only working when a particular core is given That sounds like what I would expect to happen. Using HTTP (not SolrJ), the CoreAdmin API is accessed
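
The difference can be seen in the plain HTTP paths: CoreAdmin lives under the server-level `/admin/cores` path, while ping is a handler inside one core. A minimal sketch with a hypothetical host and core name:

```python
def core_admin_status_url(base_url):
    """CoreAdmin is a server-level API: no core name appears in the path."""
    return f"{base_url}/admin/cores?action=STATUS"


def ping_url(base_url, core):
    """Ping is a per-core handler: the core name is part of the path."""
    return f"{base_url}/{core}/admin/ping"


# Hypothetical node address and core name.
base = "http://localhost:8983/solr"
print(core_admin_status_url(base))
print(ping_url(base, "mycore"))
```

A client whose base URL already ends in a core name would therefore produce a wrong path for CoreAdmin, and a client whose base URL has no core name would produce a wrong path for ping.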

Re: Solr 7.5 DeleteShard not working when all cores are down

2019-03-14 Thread Shawn Heisey
On 3/14/2019 12:47 PM, Aroop Ganguly wrote: I am trying to delete a shard from a collection using the collections api for the same. On the solr ui, all the replicas are in “downed” state. However, when I run the delete shard command: /solr/admin/collections?action=DELETESHARD&collection=x&shard=shard84 I

Re: FieldTypes and LowerCase

2019-03-14 Thread Shawn Heisey
On 3/14/2019 8:49 AM, Moyer, Brett wrote: Thanks Shawn, " Analysis only happens to indexed data" Being the case when the data gets Indexed, then wouldn't the Analyzer kickoff and lowercase the URL? The analyzer I have defined is not set for Index or Query, so as I understand it will fire

Re: Solr collection indexed to pdf in hdfs throws error during solr restart

2019-03-14 Thread Shawn Heisey
On 3/14/2019 1:13 AM, VAIBHAV SHUKLA shuklavaibha...@yahoo.in wrote: When I restart Solr it throws the following error. Solr collection indexed to pdf in hdfs throws error during solr restart. Error Caused by: org.apache.lucene.store.LockObtainFailedException: Index dir

Re: Commits and new document visibility

2019-03-14 Thread Shawn Heisey
On 3/14/2019 8:23 AM, Christopher Schultz wrote: I believe that the only thing I want to do is to set the autoSoftCommit value to something "reasonable". I'll probably start with maybe 15000 (15sec) to match the hard-commit setting and see if we get any complaints about delays between "save" and
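
One way such a soft-commit interval could be applied without editing solrconfig.xml by hand is the Config API's `set-property` command. This is a sketch under assumptions: the host and collection name are hypothetical, and the code only builds the request body and endpoint, leaving the POST to whatever HTTP client is in use.

```python
import json


def soft_commit_payload(max_time_ms):
    """JSON body for a Config API set-property command on autoSoftCommit."""
    command = {"set-property": {"updateHandler.autoSoftCommit.maxTime": max_time_ms}}
    return json.dumps(command)


def config_endpoint(base_url, collection):
    """Config API endpoint for one collection."""
    return f"{base_url}/{collection}/config"


# Hypothetical host and collection; 15000 ms is the value discussed above.
print(config_endpoint("http://localhost:8983/solr", "mycollection"))
print(soft_commit_payload(15000))
```

The same value can instead be set in the `autoSoftCommit` section of solrconfig.xml; the API route just avoids a manual config upload.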

Re: FieldTypes and LowerCase

2019-03-14 Thread Shawn Heisey
On 3/14/2019 7:47 AM, Moyer, Brett wrote: I'm using the below FieldType/Field but when I index my documents, the URL is not being lower case. Any ideas? Do I have the below wrong? Example: http://connect.rightprospectus.com/RSVP/TADF Expect: http://connect.rightprospectus.com/rsvp/tadf

Re: What causes new searcher to be created?

2019-03-10 Thread Shawn Heisey
On 3/9/2019 8:24 PM, John Davis wrote: I couldn't find an answer to this in the docs: if openSearcher is set to false in the autocommit with no softcommits, what triggers a new one to be created? My assumption is that until a new searcher is created all the newly indexed docs will not be

Re: Delay searches till log replay finishes

2019-03-08 Thread Shawn Heisey
On 3/8/2019 10:44 AM, Rahul Goswami wrote: 1) Is there currently a configuration setting in Solr that will trigger the first option you mentioned ? Which is to not serve any searches until tlogs are played. If not, since instances shutting down abruptly is not very uncommon, would a JIRA to

Re: Question on Solr/WordPress Integration

2019-03-01 Thread Shawn Heisey
On 3/1/2019 10:25 AM, Paul Buiocchi wrote: I have a couple of questions about Solr /Wordpress integration - You would need to talk to the person who wrote the plugin for Wordpress that integrates with Solr. If they indicate that a question can only be answered by the Solr project, then

Re: Old searcher to new searcher

2019-03-01 Thread Shawn Heisey
On 3/1/2019 4:42 AM, Amjad Khan wrote: We are trying to extend the AbstractSolrEventListener class and override the newSearcher method. Was curious to know if we can copy the existing searcher cache to the new searcher instead of executing the query received from solrconfig. Because we are not sure

Re: Porter Stem filter and employing

2019-03-01 Thread Shawn Heisey
On 3/1/2019 4:38 AM, Marisol Redondo wrote: When using the PorterStemFilter, I saw that the word "employing" is changed to "emploi" and my document is not found in the query to Solr because of that. This also happens with other words that end in -ying, such as annoying or deploying. Is there any

Re: High Availability with two nodes

2019-02-26 Thread Shawn Heisey
On 2/26/2019 2:39 AM, Andreas Mock wrote: currently we are looking at Apache Solr as a solution for searching. One important component is high availability. I dug around and found out that HA is built in via SolrCloud, which means I have to install ZooKeeper in a production environment which

Re: questions regrading stored fields role in query time

2019-02-26 Thread Shawn Heisey
On 2/26/2019 1:34 AM, Saurabh Sharma wrote: Now we want to do partial updates. I went through the documentation and found that all the fields should be stored or docValues for partial updates. I have a few questions regarding this: 1) In case I am just fetching only 1 field while making a query. What

Re: SOLR Tokenizer “solr.SimplePatternSplitTokenizerFactory” splits at unexpected characters

2019-02-26 Thread Shawn Heisey
On 2/26/2019 12:18 AM, Stephan Damson wrote: If we take the example input "operative", the analyzer shows that during indexing, the input gets split into the tokens "ope", "a" and "ive", that is the tokenizer splits at the characters "r" and "t", and not at the expected whitespace characters

Re: how to get high-availability for Solr csv update handler?

2019-02-25 Thread Shawn Heisey
On 2/25/2019 11:15 AM, Ganesh Sethuraman wrote: We are using Solr Cloud 7.2.1. We are using the Solr CSV update handler to do bulk updates (several million docs) into multiple collections. When we make a call to the CSV update handler using the curl command line (as below), we are pointing to a single

Re: SolrCloud fails to restart after rebooting

2019-02-23 Thread Shawn Heisey
On 2/23/2019 2:29 PM, abhishek_itengg wrote: 1) I verified the presence of the solr.xml file in the SOLR_HOME directory on all three servers. 2) I am starting the Solr nodes in cloud mode. It's configured as an NSSM Windows service. solr start -cloud -f -p 8983 -z

Re: dynamic field issue

2019-02-21 Thread Shawn Heisey
On 2/21/2019 8:01 AM, Midas A wrote: How many dynamic fields can we create in Solr? Is there any limitation? Can indexing dynamic fields increase heap memory on the server? At the Lucene level, there is absolutely no difference between a standard field and a dynamic field. The difference in

Re: Newbie question - Error loading an existing config file

2019-02-20 Thread Shawn Heisey
On 2/20/2019 11:07 AM, Greg Robinson wrote: Let's try this: https://imgur.com/a/z5OzbLW What I'm trying to do seems pretty straightforward: 1. Install Solr Server 7.4 on Linux (Completed) 2. Connect my Drupal 7 site to the Solr Server and use it for indexing content My understanding is that I

Re: SOLR 7.5.0 (Migrate from 5.3.1 to 7.5.0)

2019-02-13 Thread Shawn Heisey
On 2/12/2019 9:25 PM, ramyogi wrote: [test_shard20_replica_n38] PERFORMANCE WARNING: Overlapping onDeckSearchers=6 2/12/2019, 1:45:39 PM WARN true x:test_shard20_replica_n38 DirectUpdateHandler2 Starting optimize... Reading and rewriting the entire index! Use with care. Even though the index is

Re: Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Shawn Heisey
On 2/12/2019 7:35 AM, Joe Obernberger wrote: Yesterday, we upgraded our 40 node cluster from solr 7.6.0 to solr 7.7.0.  This morning, all the nodes are using 1200+% of CPU. It looks like it's in garbage collection.  We did reduce our HDFS cache size from 11G to 6G, but other than that, no

Re: Docker and Solr Indexing

2019-02-12 Thread Shawn Heisey
On 2/12/2019 6:56 AM, solrnoobie wrote: I know this is too late of a reply but I found this on our solr.log java.nio.file.NoSuchFileException: USUALLY, this is a harmless annoyance, not an indication of an actual problem. Some people have indicated that it causes problems when using the

Re: How to stop a new slave from serving request until it has replicated index the first time.

2019-02-06 Thread Shawn Heisey
On 2/6/2019 9:13 AM, Pushkar Raste wrote: In the master/slave setup, as soon as I start a new slave it starts to serve requests. Often the searches result in no documents being found, as the index has not been replicated yet. Is there a way to stop a replica from serving requests (marking the node unhealthy)
