Re: Index optimization takes too long

2018-11-02 Thread Shawn Heisey
On 11/2/2018 5:00 PM, Wei wrote: After a recent schema change, it takes almost 40 minutes to optimize the index. The schema change is to enable docValues for all sort/facet fields, which increased the index size from 12G to 14G. Before the change it only took 5 minutes to do the optimization.

Re: SolrCloud performance

2018-11-02 Thread Shawn Heisey
On 11/2/2018 1:38 PM, Chuming Chen wrote: I am running a Solr cloud 7.4 with 4 shards and 4 nodes (JVM "-Xms20g -Xmx40g"), each shard has 32 million documents and 32Gbytes in size. A 40GB heap is probably completely unnecessary for an index of that size.  Does each machine have one replica

Re: TLOG replica stucks

2018-11-02 Thread Shawn Heisey
On 11/2/2018 3:12 AM, Vadim Ivanov wrote: It seems to me that issue related with: - restart solr node - rebalance leader - reload collection - reload core (Core admin is not forbidden but seems obsolete in SolrCloud) In SolrCloud, CoreAdmin is an expert option.  Many of the things that the

Re: SolrCloud scaling/optimization for high request rate

2018-10-30 Thread Shawn Heisey
On 10/29/2018 7:24 AM, Sofiya Strochyk wrote: Actually the smallest server doesn't look bad in terms of performance, it has been consistently better than the other ones (without replication) which seems a bit strange (it should be about the same or slightly worse, right?). I guess the memory

Re: SolrCloud scaling/optimization for high request rate

2018-10-30 Thread Shawn Heisey
On 10/29/2018 8:56 PM, Erick Erickson wrote: The interval between when a commit happens and all the autowarm queries are finished is 52 seconds for the filterCache. seen warming that that long unless something's very unusual. I'd actually be very surprised if you're really only firing 64

Re: partial update in solr

2018-10-29 Thread Shawn Heisey
On 10/29/2018 7:40 AM, Zahra Aminolroaya wrote: Thanks Alex. I want to have a query for atomic update with solrj like below:

Re: SolrCloud scaling/optimization for high request rate

2018-10-27 Thread Shawn Heisey
On 10/26/2018 9:55 AM, Sofiya Strochyk wrote: We have a SolrCloud setup with the following configuration: I'm late to this party.  You've gotten some good replies already.  I hope I can add something useful. * 4 nodes (3x128GB RAM Intel Xeon E5-1650v2, 1x64GB RAM Intel Xeon

Re: Edismax query returning the same number of results using AND as it does with OR

2018-10-26 Thread Shawn Heisey
Followup: I had a theory that Nicky tested, and I think what was observed confirms the theory. TL;DR: In previous versions, I think there was a bug where the presence of boolean operators caused edismax to ignore the mm parameter, and only rely on the boolean operator(s). After that bug got

Re: Regarding multi keyword search

2018-10-23 Thread Shawn Heisey
On 10/23/2018 8:20 AM, Gauri Dhawan wrote: I have been facing an issue for quite some time and haven't been able to come to a solution as of yet. We are trying to implement search on our platform and all our data is stored in Solr. I have a field `description` which is the field where I have to

Re: Slow import from MsSQL and down cluster during process

2018-10-23 Thread Shawn Heisey
On 10/23/2018 7:15 AM, Daniel Carrasco wrote: Hello, Thanks for your response. We've already thought about that and doubled the instances. Just now for every Solr instance we've 60GB of RAM (40GB configured on Solr), and a 16 Cores CPU. The entire Data can be stored on RAM and will not fill

Re: AW: AW: AW: 6.6 -> 7.5 SolrJ, seeing many "Connection evictor"-Threads

2018-10-23 Thread Shawn Heisey
On 10/22/2018 9:44 PM, Clemens Wyss DEV wrote: On 10/22/2018 6:15 AM, Shawn Heisey wrote: autoSoftCommit is pretty aggressive . If your commits are taking 1-2 seconds or les well, some take minutes (re-index)! Are you absolutely sure that you have commits taking that much time?  I'm

Re: Internal Solr communication question

2018-10-23 Thread Shawn Heisey
On 10/23/2018 9:31 AM, Fernando Otero wrote: Hey all I'm running some tests on Solr cloud (10 nodes, 3 shards, 3 replicas), when I run the queries I end up seeing 7x traffic ( requests / minute) in Newrelic. Could it be that the internal communication between nodes is done through HTTP

Re: ZookeeperServer not running/Client Session timed out

2018-10-22 Thread Shawn Heisey
On 10/22/2018 7:32 PM, Susheel Kumar wrote: Hi Shawn, you meant ZK GC log correct? There was another potential cause I was thinking of, but when I got to where I was going to list them in the previous message, I could not for the life of me remember what the other one was. I just

Re: ZookeeperServer not running/Client Session timed out

2018-10-22 Thread Shawn Heisey
On 10/22/2018 7:32 PM, Susheel Kumar wrote: Hi Shawn, you meant ZK GC log correct? No, the GC log from Solr.  A heap that's too small could happen to ZK as well, but I would expect that problem more on the Solr side. You could try increasing the heap size to see if that makes any

Re: Query to multiple collections

2018-10-22 Thread Shawn Heisey
On 10/22/2018 1:26 PM, Chris Ulicny wrote: There weren't any particular problems we ran into since the client that makes the queries to multiple collections previously would query multiple cores using the 'shards' parameter before we moved to solrcloud. We didn't have any complicated sorting or

Re: ZookeeperServer not running/Client Session timed out

2018-10-22 Thread Shawn Heisey
On 10/22/2018 3:31 PM, Susheel Kumar wrote: Hello, I am seeing "ZookeeperServer not running" WARN messages in zookeeper logs which is causing the Solr client connections to timeout... What could be the problem? ZK: 3.4.10 Zookeeper.out == For help with the ZK server log, you'll need to

Re: Integrate nutch with solr

2018-10-22 Thread Shawn Heisey
On 10/22/2018 3:26 PM, Dinesh Sundaram wrote: Thanks Shawn for the reply, yes I do have some questions on the solr too. can you please share the steps for solr side to integrate the nutch or no steps are needed in solr? Since I have no idea what has to happen on the nutch side, I really can't

Re: SOLR External Id field

2018-10-22 Thread Shawn Heisey
On 10/22/2018 12:46 PM, Rathor, Piyush (US - Philadelphia) wrote: We are storing data in solr. Please let me know on the following: * How can we set a field as external id which can be used for update. * What operation/ query needs to sent to update the same external id record.

Re: AW: AW: 6.6 -> 7.5 SolrJ, seeing many "Connection evictor"-Threads

2018-10-22 Thread Shawn Heisey
On 10/21/2018 11:10 PM, Clemens Wyss DEV wrote: For the UpdateRequests it is the "commitWithinMs"-parameter? To me this parameter sounds like telling the solr-server I need to see this data within "x ms". As we have autoCommit and autoSoftCommit The commitWithin parameter is effectively

Re: Is there a tool to directly index hdfs files to solr?

2018-10-22 Thread Shawn Heisey
On 10/18/2018 6:17 AM, shreck wrote: why remove "\solr\contrib\map-reduce" lib from solr6.6.1? Those contrib modules were removed for two primary reasons: * They are available elsewhere. * The copy included with Solr was not being maintained. See this issue:

Re: AW: 6.6 -> 7.5 SolrJ, seeing many "Connection evictor"-Threads

2018-10-21 Thread Shawn Heisey
On 10/21/2018 11:43 AM, Clemens Wyss DEV wrote: If I omit the core in the url upon creation of the SolrClient, where can I then "indicate" the core? You do it with the request, not with the client.
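A rough illustration of the same idea at the HTTP level (a sketch, not SolrJ itself): the base URL stops at /solr, and the core name is supplied each time a request is built. The host, port, and core names below are hypothetical. In SolrJ, the analogous request methods accept the collection or core name as an argument.

```python
from urllib.parse import urlencode

# Assumed base URL with no core name in it (hypothetical host and port).
BASE_URL = "http://localhost:8983/solr"


def select_url(core, params):
    """Build a query URL for one core; the core is chosen per request."""
    return f"{BASE_URL}/{core}/select?{urlencode(params)}"


# The same base URL serves two different (hypothetical) cores.
print(select_url("products", {"q": "*:*"}))
print(select_url("users", {"q": "name:smith"}))
```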

Re: 6.6 -> 7.5 SolrJ, seeing many "Connection evictor"-Threads

2018-10-21 Thread Shawn Heisey
On 10/21/2018 10:13 AM, Clemens Wyss DEV wrote: Just upgrading from 6.6 to 7.5 and am now seeing many "Connection evictor"-threads which are all Thread.sleep()ing ... What's the stacktrace on those threads?  If they're sleeping, then it's unlikely that there's any real contribution to system

Re: Response time creep in Solr

2018-10-19 Thread Shawn Heisey
On 10/19/2018 7:57 AM, Roopa Rao wrote: From the past few months there has been a steady increase in the Solr response time in our application, yes there are enhancements and index size increase. How to approach this issue to find the root cause for this slow and constant increase? What

Re: Integrate nutch with solr

2018-10-18 Thread Shawn Heisey
On 10/18/2018 12:35 PM, Dinesh Sundaram wrote: Can you please share the steps to integrate nutch 2.3.1 with solrcloud 7.1.0. You will need to speak to the nutch project about how to configure their software to interact with Solr.  If you have questions about Solr itself, we can answer those.

Re: Constant Score

2018-10-17 Thread Shawn Heisey
On 10/17/2018 5:06 PM, Vincenzo D'Amore wrote: I tried to use constant score into qf parameter but I had an exception. Is this normal? The qf parameter actually is something like this: field1^3 field2^4 field3^5... etc. You didn't actually say, but it sounds like you're trying to use

Re: Casting from schemaless to classic schema

2018-10-17 Thread Shawn Heisey
On 10/17/2018 5:36 AM, Zahra Aminolroaya wrote: What would be the challenges that I will confront with as my schemaless collection has some indexed documents in it? If the schema itself (the file named managed-schema that you might be renaming to schema.xml) hasn't changed, then the existing

Re: Solr performing Calculations vs. Pulling data Values Directly From DB Question

2018-10-17 Thread Shawn Heisey
On 10/17/2018 9:19 AM, Joseph Costello - F Reports wrote: Any feedback from the group on the question below. The question was will solr performing distance calculations (10,000++) on the fly, perform faster than SQL query simply pulling pre-calculated distance values directly from the

Re: Solr Cloud - .NET Client

2018-10-17 Thread Shawn Heisey
On 10/17/2018 7:19 AM, Tech Support wrote: We need to implement "Solr" search engine with "Solr Cloud" in our running/existing .NET Application (4.5 VS2012). Is there any .NET client (recommended) with Solr Cloud operations. We have tried with "SolrNet" .net client available in the GitHub

Re: Device I/O trouble with solr 7.5

2018-10-16 Thread Shawn Heisey
On 10/16/2018 6:04 AM, zoolette wrote: We are today running under SOLR 6.6 on our production environment. On the end of august, I planned to upgrade SOLR to 7.4 (7.5 since that moment) but I encounter some trouble. Our master SOLR is replicated to a slave SOLR. I tried to upgrade the replica

Re: Solr Shards down for unknown reason

2018-10-15 Thread Shawn Heisey
On 10/15/2018 1:30 PM, Dasarathi Minjur wrote: We have a Hadoop cluster with Solr 6.3 running as service. After an OS security patching, when the cluster was restarted, Solr Cloud is up but the shards are down all the time. No specific messages in Solr.log or console logs. Tried restarting

Re: Solr 7.4.0 : Question related to Stalling of unit tests

2018-10-15 Thread Shawn Heisey
On 10/15/2018 11:00 AM, vishal ghugare wrote: I have built Solr 7.4.0 with upgraded version (25.0-jre/26.0-jre) of guava dependency. When unit tests are run against solr with guava 25.0-jre or 26.0-jre, some of the unit tests stall indefinitely and the testing never finishes up. For example,

Re: Something odd with async request status for BACKUP operation on Collections API

2018-10-14 Thread Shawn Heisey
On 10/14/2018 10:39 PM, Shalin Shekhar Mangar wrote: The responses are collected by node so subsequent responses from the same node overwrite previous responses. Definitely a bug. Please open an issue. Done. https://issues.apache.org/jira/browse/SOLR-12867 Thanks, Shawn

Re: Zookeeper external vs internal

2018-10-14 Thread Shawn Heisey
On 10/14/2018 9:31 PM, Sourav Moitra wrote: My question does running separate zookeeper ensemble in the same boxes provides any advantage over using the solr embedded zookeeper ? The major disadvantage to having ZK embedded in Solr is this:  If you stop or restart the Solr process, part of

Re: Something odd with async request status for BACKUP operation on Collections API

2018-10-14 Thread Shawn Heisey
On 10/14/2018 6:25 PM, dami...@gmail.com wrote: I had an issue with async backup on solr 6.5.1 reporting that the backup was complete when clearly it was not. I was using 12 shards across 6 nodes. I only noticed this issue when one shard was much larger than the others. There were no answers

Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)

2018-10-14 Thread Shawn Heisey
On 10/14/2018 6:32 AM, yasoobhaider wrote: Memory Analyzer output: One instance of "org.apache.solr.uninverting.FieldCacheImpl" loaded by "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x7f60f7b38658" occupies 61,234,712,560 (91.86%) bytes. The memory is accumulated in one instance of

Re: Latency in updates.

2018-10-12 Thread Shawn Heisey
On 10/12/2018 8:46 AM, root23 wrote: We are having an issue where we are seeing latency in updates. We are on solr 6. The new documents are reflected right away but updates to existing documents take some time, from 30 seconds to a couple of minutes. These are some relevant things from our solrconfig.

Re: Default merge policy

2018-10-12 Thread Shawn Heisey
On 10/12/2018 8:32 AM, root23 wrote: We are on solr 6. and as per the documentation i think solr 6 uses TieredMergePolicyFactory. However we have not specified it in the following way 10 10 We still use 25. which i understand is not used by TieredMergePolicyFactory. Supplementing

Something odd with async request status for BACKUP operation on Collections API

2018-10-12 Thread Shawn Heisey
I'm working on reproducing a problem reported via the IRC channel. Started a test cloud with 7.5.0. Initially with two nodes, then again with 3 nodes.  Did this on Windows 10. Command to create a collection: bin\solr create -c test2 -shards 30 -replicationFactor 2 For these URLs, I dropped
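For readers unfamiliar with the async workflow being tested here, the following is a minimal sketch of how the two Collections API requests could be assembled: one BACKUP call carrying an `async` request ID, and one REQUESTSTATUS call that polls with the same ID. The host, backup name, location, and request ID are hypothetical and are not the values used in the original test.

```python
from urllib.parse import urlencode

# Hypothetical Collections API endpoint; adjust host and port as needed.
ADMIN = "http://localhost:8983/solr/admin/collections"


def backup_url(collection, name, location, request_id):
    """URL that starts an asynchronous BACKUP of one collection."""
    params = {
        "action": "BACKUP",
        "collection": collection,
        "name": name,
        "location": location,
        "async": request_id,  # ID chosen by the caller and used for polling
    }
    return f"{ADMIN}?{urlencode(params)}"


def status_url(request_id):
    """URL that polls the status of a previously submitted async request."""
    return f"{ADMIN}?{urlencode({'action': 'REQUESTSTATUS', 'requestid': request_id})}"


print(backup_url("test2", "test2backup", "/tmp/backups", "req1"))
print(status_url("req1"))
```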

Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)

2018-10-11 Thread Shawn Heisey
On 10/11/2018 4:51 AM, yasoobhaider wrote: Hi Shawn, thanks for the inputs. I have uploaded the gc logs of one of the slaves here: https://ufile.io/ecvag (should work till 18th Oct '18) I uploaded the logs to gceasy as well and it says that the problem is consecutive full GCs. According to the

Re: ManagedIndexSchema Bad version when trying to persist schema

2018-10-11 Thread Shawn Heisey
On 10/11/2018 10:07 AM, Mikhail Ibraheem wrote: Hi Erick, Thanks for your reply. No, we aren't using schemaless mode.    is not explicitly declared in our solrconfig.xml Schemaless mode is not turned on by the schemaFactory config element. The default configurations that Solr ships with have

Re: Tika and Solr : rejected document due to mime type restrictions

2018-10-11 Thread Shawn Heisey
On 10/11/2018 9:06 AM, Bisonti Mario wrote: I startup tika server from command line: java -jar /opt/tika/tika-server-1.19.1.jar I configured, with ManifoldCF a connector to Solr. When I start the ingest of pdf and .xls document, I see in the tika server: so it seems that tika server process

Re: Solr JVM Memory settings

2018-10-10 Thread Shawn Heisey
On 10/10/2018 11:33 PM, Sourav Moitra wrote: Hello Shawn, Thanks for quick reply. Where precisely are you seeing the 98% usage? On the Solr web UI. Although the same Solr UI is reporting heap usage to be below 1 GB. Also I found that Solr Java is holding(VIRT) 8GB of total memory even with

Re: Solr JVM Memory settings

2018-10-10 Thread Shawn Heisey
On 10/10/2018 10:08 PM, Sourav Moitra wrote: We have a Solr server with 8gb of memory. We are using solr in cloud mode, solr version is 7.5, Java version is Oracle Java 9 and settings for Xmx and Xms value is 2g but we are observing that the RAM getting used to 98% when doing indexing. How can

Re: SolrJ does not use HTTP proxy anymore in 7.5.0 after update from 6.6.5

2018-10-10 Thread Shawn Heisey
On 10/1/2018 6:54 AM, Andreas Hubold wrote: Is there some other way to configure an HTTP proxy, e.g. with HttpSolrClient.Builder? I don't want to create an Apache HttpClient instance myself but the builder from Solrj (HttpSolrClient.Builder). Unless you want to wait for a fix for SOLR-12848,

Re: Deciding on the number of Shards and Replica

2018-10-07 Thread Shawn Heisey
On 10/7/2018 7:28 PM, Sourav Moitra wrote: I am Solr newbie. I am trying to setup three servers running both Zookeeper ensemble and Solr in cloud mode. Each server has 4 core and 16gb of RAM. To start with I have put Xmx value of 6144M to Zookeeper and Xmx value of 2048 to Solr. We have created 3

Re: Solr Cloud in recovering state & down state for long

2018-10-05 Thread Shawn Heisey
On 10/5/2018 9:15 AM, Ganesh Sethuraman wrote: I am not sure the logs and GC logs were evident from my previous mail. Re-posting it here for your reference: Here is the full Solr Log file (Note that it is in INFO mode): https://raw.githubusercontent.com/ganeshmailbox/har/master/SolrLogFile Here

Re: Connecting Solr to Nutch

2018-10-05 Thread Shawn Heisey
On 10/5/2018 7:24 AM, Timeka Cobb wrote: Good morning! The Nutch community doesn't help much..the problem I notice is where they say install Solr the first step create resources: the basicconfig file does not exist at all in the Solr packet..I can't connect because Solr is missing files that are

Re: Apache SOLR upgrade from 5.2.1 to 7.x

2018-10-05 Thread Shawn Heisey
On 10/5/2018 4:41 AM, padmanabhan1616 wrote: 1. We cannot upgrade directly from 5.x to 7.x instead upgrade to 5.5 then upgrade to 7 as there is major index format level changes taken place in 5.5 or later version. Solr 7.x cannot read indexes from 5.5.  It can only read indexes that were

Re: Solr Cloud in recovering state & down state for long

2018-10-05 Thread Shawn Heisey
On 10/5/2018 5:15 AM, Ganesh Sethuraman wrote: 1. Does GC and Solr Logs help to why the Solr replicas server continues to be in the recovering/ state? Our assumption is that Sept 17 16:00 hrs we had done ZK transaction log reading, that might have caused the issue. Is that correct? 2. Does this

Re: Filtering group query results

2018-10-04 Thread Shawn Heisey
On 10/4/2018 7:10 AM, Greenhorn Techie wrote: We have a requirement where we need to perform a group query in Solr where results are grouped by user-name (which is a field in our indexes) . We then need to filter the results based on numFound response parameter present under each group. In

Re: Modify the log directory for dih

2018-10-04 Thread Shawn Heisey
On 10/4/2018 12:30 AM, lala wrote: Hi, I am using: Solr: 7.4 OS: windows7 I start solr using a service on startup. In that case, I really have no idea where anything is on your system. There is no service installation from the Solr project for Windows -- either you obtained that from

Re: Migrate cores from 4.10.2 to 7.5.0

2018-10-03 Thread Shawn Heisey
On 10/3/2018 3:17 PM, Pure Host - Wolfgang Freudenberger wrote: Is there any way to migrate cores from 4.10.2 to 7.5.0? I guess not, but perhaps someone has an idea. ^^ In a word, no.  A specific major version of Solr is only guaranteed to read indexes built and managed *completely* by the

Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)

2018-10-03 Thread Shawn Heisey
On 10/3/2018 8:01 AM, yasoobhaider wrote: Master and slave config: ram: 120GB cores: 16 At any point there are between 10-20 slaves in the cluster, each serving ~2k requests per minute. Each slave houses two collections of approx 10G (~2.5mil docs) and 2G(10mil docs) when optimized. I am

Re: Restoring and upgrading a standalone index to SolrCloud

2018-10-03 Thread Shawn Heisey
On 10/3/2018 10:45 AM, Shawn Heisey wrote: Here's one way to do this: Oh, and when you delete the data directory, delete the tlog directory too.  Don't copy tlog from the non-cloud install.  Solr will re-create it as long as the directory gives it permission to do so. Thanks, Shawn

Re: Restoring and upgrading a standalone index to SolrCloud

2018-10-03 Thread Shawn Heisey
On 10/3/2018 9:42 AM, Jack Schlederer wrote: I've successfully upgraded the Lucene 5 index to Lucene 6, and then to Lucene 7, Upgrading through two major versions is not guaranteed to work.  Upgrading from an index fully built by major version X-1 is supported, but if X-2 or earlier has EVER

Re: Metrics API via Solrj

2018-10-03 Thread Shawn Heisey
On 10/3/2018 6:17 AM, Jason Gerlowski wrote: NamedList respNL = response.getResponse(); NamedList metrics = (NamedList)respNL.get("metrics"); NamedList jvmMetrics = (NamedList) metrics.get("solr.jvm"); Long numClassesLoaded = (Long)

Re: Modify the log directory for dih

2018-10-03 Thread Shawn Heisey
On 10/2/2018 10:49 PM, lala wrote: Shawn Heisey-2 wrote With a change to the log4j configuration file, you can direct all logs created by the DIH classes to a separate file, no code changes needed. Since I'm a newbie regarding log4j, can you please give me an example about how to change

Re: Solr Cloud in recovering state & down state for long

2018-10-02 Thread Shawn Heisey
On 10/2/2018 8:55 PM, Ganesh Sethuraman wrote: We are using 2 node SolrCloud 7.2.1 cluster with external 3 node ZK ensemble in AWS. There are about 60 collections at any point in time. We have per JVM max heap of 8GB. Let's focus for right now on a single Solr machine, rather than the whole

Re: Clarification about Solr Cloud and Shard

2018-10-02 Thread Shawn Heisey
On 10/2/2018 9:33 AM, Rekha wrote: Dear Solr Team, I need following clarification from you, please check and give suggestion to me, 1. I want to store and search 200 Billions of documents(Each document contains 16 fields). For my case can I able to achieve by using Solr cloud? 2. For my case

Re: Rule-based replication or sharing

2018-10-02 Thread Shawn Heisey
On 10/2/2018 9:11 AM, Chuck Reynolds wrote: Until we move to Solr 7.5 is there a way that we can control sharding with the core.properties file? It seems to me that you used to be able to put a core.properties file in the Solr home path with something like the following.

Re: Modify the log directory for dih

2018-10-02 Thread Shawn Heisey
On 10/2/2018 3:33 AM, lala wrote: I know that Solr logs the dih operations (& most of other operations) in server\logs\solr.log file. What I want is to configure the dih requests to be logged in another path, with another name if it's possible. DIH doesn't make its own logfile.  Just like the

Re: Dynamic filters

2018-10-02 Thread Shawn Heisey
On 10/2/2018 5:03 AM, Tamás Barta wrote: Thank you for the answers! Is it possible to get the facet result and the search results with only one query? Or I have to send two queries for the Solr (one for search results and one for facets)? It only requires one query.  You just add facet
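A minimal sketch of such a single request, assuming a hypothetical core URL and hypothetical field names (`brand`, `color`): the search parameters and the facet parameters travel in the same query string, and the response then carries both the documents and the facet counts.

```python
from urllib.parse import urlencode

# Hypothetical core URL; the field names below are illustrative only.
SELECT_URL = "http://localhost:8983/solr/store/select"

params = [
    ("q", "shoes"),            # the user's search
    ("facet", "true"),         # turn faceting on for this same request
    ("facet.field", "brand"),  # one facet.field entry per filter attribute
    ("facet.field", "color"),
]

query_string = urlencode(params)
print(f"{SELECT_URL}?{query_string}")
```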

Re: Dynamic filters

2018-10-02 Thread Shawn Heisey
On 10/2/2018 4:55 AM, Tamás Barta wrote: I have been using Solr for a while for an online web store. After search a filter box appears where user can filter results by many attributes. My question is how can I do it with Solr that the filter box shows only available options based on result. For

Re: autoAddReplicas – what am I missing?

2018-10-01 Thread Shawn Heisey
On 10/1/2018 11:49 AM, Michael B. Klein wrote: Then I try my experiment. 1) I bring up a 4th node (.4) and wait for it to join the cluster. I now see .1, .2, .3, and .4 in live_nodes, and .1, .2, and .3 on the graph, still as expected. 2) I kill .2. Predictably, it falls off the list of

Re: what's in cursorMark

2018-10-01 Thread Shawn Heisey
On 10/1/2018 7:36 AM, Li, Yi wrote: cursorMark appears as something like AoE/E2Zhdm9yaXRlUGxhY2UvZjg1MzMzYzEtYzQ0NC00Y2ZiLWFmZDctMzcyODFhMDdiMGY3 and the document says it is “Base64 encoded serialized representation of the sort values encapsulated by this object” I like to know if I can
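For inspection only, the value quoted above can be Base64-decoded with the standard library. This is a sketch for curiosity's sake; it makes no claim about the exact serialization format, and it is generally safer for client code to treat cursorMark as an opaque token.

```python
import base64

# The cursorMark value quoted in the question above.
cursor_mark = "AoE/E2Zhdm9yaXRlUGxhY2UvZjg1MzMzYzEtYzQ0NC00Y2ZiLWFmZDctMzcyODFhMDdiMGY3"

raw = base64.b64decode(cursor_mark)

# Show printable ASCII as-is and every other byte as '.'.
readable = "".join(chr(b) if 32 <= b < 127 else "." for b in raw)
print(readable)
```

The first few bytes are binary framing, and the rest is the readable sort value of the last document returned.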

Re: Creating CJK bigram tokens with ClassicTokenizer

2018-10-01 Thread Shawn Heisey
On 9/30/2018 10:14 PM, Yasufumi Mizoguchi wrote: I am looking for the way to create CJK bigram tokens with ClassicTokenizer. I tried this by using CJKBigramFilter, but it only supports for StandardTokenizer... CJKBigramFilter shouldn't care what tokenizer you're using.  It should work with

Re: SolrJ cannot get actual cause of error

2018-09-30 Thread Shawn Heisey
On 9/29/2018 3:08 AM, Ryan Qin wrote: I’m working on a project which uses solr as search engine. I found I cannot get the root cause of error from SolrJ. CloudSolrClient uses LBHttpSolrClient internally.  This client has a tendency to wrap all exceptions in the "No live SolrServers" message. 

Re: Realtime get not always returning existing data

2018-09-29 Thread Shawn Heisey
On 9/28/2018 8:11 PM, sgaron cse wrote: @Shawn We're running two instances on one machine for two reasons: 1. The box has plenty of resources (48 cores / 256GB ram) and since I was reading that it's not recommended to use more than 31GB of heap in SOLR we figured 96 GB for keeping index data in OS

Re: Auto recovery of a failed Solr Cloud Node?

2018-09-28 Thread Shawn Heisey
On 9/28/2018 4:18 PM, Christopher Schultz wrote: I thought someone recently mentioned (but I cannot find a reference, sorry) that Solr would automatically restart if an OutOfMemoryError was encountered. Is that only for single-node Solr (i.e. non-cloud/ZK)? On non-windows systems, Solr

Re: Realtime get not always returning existing data

2018-09-28 Thread Shawn Heisey
On 9/28/2018 6:09 AM, sgaron cse wrote: because this is a test deployment replica is set to 1 so as far as I understand, data will not be replicated for this core. Basically we have two SOLR instances running on the same box. One on port 8983, the other on port 8984. We have 9 cores on this SOLR

Re: Faceting with a multi valued field

2018-09-27 Thread Shawn Heisey
On 9/25/2018 2:14 PM, Hanjan, Harinder wrote: Hello! When starting a new topic on the mailing list, do not reply to an existing message.  Your thread is buried within a thread originally titled "Extracting top level URL when indexing document".

Re: Realtime get not always returning existing data

2018-09-27 Thread Shawn Heisey
On 9/27/2018 11:48 AM, sgaron cse wrote: So this is a SOLR core where we keep configuration data so it is almost never written to. The statistics for the core say it's been last modified 4 hours ago, yet I got doc:null from the API an hour ago. And also you don't have to have a lot of data into

Re: Json object values in solr string field

2018-09-27 Thread Shawn Heisey
On 9/27/2018 8:53 AM, Balanathagiri Ayyasamypalanivel wrote: Thanks Shawn for your prompt response. Actually we have to filter on the query time while calculate the score. The challenge here is we should not add the asset and put as static field in the index time. The asset needs to be

Re: Json object values in solr string field

2018-09-27 Thread Shawn Heisey
On 9/26/2018 12:46 PM, Balanathagiri Ayyasamypalanivel wrote: But only drawback here is we have to parse the json to do the sum of the values, is there any other way to handle this scenario. Solr cannot do that for you.  You could put this in your indexing software -- add up the numbers and

Re: Auto recovery of a failed Solr Cloud Node?

2018-09-27 Thread Shawn Heisey
On 9/27/2018 8:00 AM, Shawn Heisey wrote: On 9/27/2018 7:24 AM, Kimber, Mike wrote: I'm trying to determine if there is any health check available to determine the above and then if the issue happens then an automated mechanism in SolrCloud to restart the instance. Or is this something we

Re: Auto recovery of a failed Solr Cloud Node?

2018-09-27 Thread Shawn Heisey
On 9/27/2018 7:24 AM, Kimber, Mike wrote: I'm trying to determine if there is any health check available to determine the above and then if the issue happens then an automated mechanism in SolrCloud to restart the instance. Or is this something we have to code ourselves? As shipped by the

Re: Making Solr Indexing Errors Visible

2018-09-27 Thread Shawn Heisey
On 9/26/2018 2:39 PM, Terry Steichen wrote: Let me try to clarify a bit - I'm just using bin/post to index the files in a directory.  That indexing process produces a lengthy screen display of files that were indexed.  (I realize this isn't production-quality, but I'm not ready for production

Re: Solr Search Special Characters

2018-09-27 Thread Shawn Heisey
On 9/26/2018 10:39 PM, Rathor, Piyush (US - Philadelphia) wrote: We are facing some issues in search with special characters. Can you please help in query if the search is done using following characters: • “&” • AND • ( • ) There are two ways. 
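The excerpt is cut off before the two ways are named, so the following is only a sketch of two commonly used techniques, not necessarily the ones the reply goes on to describe: backslash-escaping individual characters, and wrapping text in double quotes so that a word such as AND is treated as a literal term. The character set is an assumption, similar in spirit to SolrJ's `ClientUtils.escapeQueryChars`; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed set of characters that the query parser may treat as syntax.
SPECIAL = set('\\+-!():^[]"{}~*?|&;/')


def escape_term(text):
    """Backslash-escape each special character in the text."""
    return "".join("\\" + c if c in SPECIAL else c for c in text)


def quote_phrase(text):
    """Wrap text in double quotes, escaping embedded backslashes and quotes."""
    inner = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{inner}"'


print(escape_term("R&D (beta)"))  # special characters prefixed with backslash
print(quote_phrase("AND"))        # the operator word as a quoted literal

# When sent over HTTP, the value still needs URL encoding.
print(urlencode({"q": escape_term("R&D")}))
```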

Re: Making Solr Indexing Errors Visible

2018-09-26 Thread Shawn Heisey
On 9/26/2018 2:39 PM, Terry Steichen wrote: To the best of my knowledge, I'm not using SolrJ at all.  Just Solr-out-of-the-box.  In this case, if I understand you below, it "should indicate an error status" I think you'd know if you were using SolrJ directly.  You'd have written the indexing

Re: Making Solr Indexing Errors Visible

2018-09-26 Thread Shawn Heisey
On 9/26/2018 1:23 PM, Terry Steichen wrote: I'm pretty sure this was covered earlier.  But I can't find references to it.  The question is how to make indexing errors clear and obvious. If there's an indexing error and you're NOT using the concurrent client in SolrJ, the response that Solr

Re: Json object values in solr string field

2018-09-26 Thread Shawn Heisey
On 9/26/2018 12:20 PM, Balanathagiri Ayyasamypalanivel wrote: Currently I am storing json object type of values in string field in solr. Using this field, in the code I am parsing json objects and doing sum of the values under it. In solr, do we have any option in doing it by default when using

Re: Java version 11 for solr 7.5?

2018-09-26 Thread Shawn Heisey
On 9/26/2018 9:35 AM, Jeff Courtade wrote: My concern with using g1 is solely based on finding this. Does anyone have any information on this? https://wiki.apache.org/lucene-java/JavaBugs#Oracle_Java_.2F_Sun_Java_.2F_OpenJDK_Bugs I have never had a single problem with Solr running with the G1

Re: to cloud or not to cloud

2018-09-26 Thread Shawn Heisey
On 9/26/2018 9:45 AM, Jeff Courtade wrote: We are considering a move to solr 7.x my question is Must we use cloud? We currently do not and all is well. It seems all work is done referencing cloud implementations. You do not have to use cloud. For most people who are starting from scratch, I

Re: Rule-based replication or sharing

2018-09-25 Thread Shawn Heisey
On 9/25/2018 9:21 AM, Chuck Reynolds wrote: Each server has three instances of Solr running on it so every instance on the server has to be in the same replica set. You should be running exactly one Solr instance per server.  When evaluating rules for replica placement, SolrCloud will treat

Re: Metrics

2018-09-24 Thread Shawn Heisey
On 9/24/2018 3:43 PM, Jean-Marc Spaggiari wrote: Thanks for taking a look. My indexes are on HDFS. And I configured all the solr parameters for that. The "shard page" is when I click on a SOLR server to go to the UI, then in the dropdown on the left I select a shard (a leader one), then I click

Re: Metrics

2018-09-24 Thread Shawn Heisey
On 9/24/2018 2:05 PM, Jean-Marc Spaggiari wrote: I'm running a fairly old version of SOLR (4.x) and I found the metrics on the shard page. However, sometimes there is numbers, sometimes it's all 0. Just hitting the refresh button shows this. I'm aware of

Re: Is there an easy way to compare schemas?

2018-09-24 Thread Shawn Heisey
On 9/24/2018 9:28 AM, Michael Joyner wrote: Is there an easy way to compare schemas? When upgrading nodes, we are wanting to compare the "core" and "automatically mapped" data types between our existing schema and the new managed-schema available as part of the upgraded distribution. There is

Re: Connect to Nutch Core with Solr

2018-09-23 Thread Shawn Heisey
On 9/22/2018 11:21 PM, Timeka Cobb wrote: Hello there, hope all is well! I just installed both Nutch and Solr onto a Linux/Ubuntu system and trying to connect them through the nutch core. I notice in the wiki it said that Nutch 1.15 is compatible to Solr 7.3.0 but I installed Solr 7.4.0. How do

Re: Rule-based replication or sharing

2018-09-21 Thread Shawn Heisey
On 9/21/2018 2:07 PM, Chuck Reynolds wrote: I'm using Solr 6.6 and I want to create a 90 node cluster with a replication factor of three. I'm using AWS EC2 instances and I have a requirement to replicate the data into 3 AWS availability zones. So 30 servers in each zone and I don't see a

Re: Solr 6.x and java 8

2018-09-21 Thread Shawn Heisey
On 9/21/2018 1:23 PM, tedsolr wrote: My application environment runs java 1.8. However I'm stuck building to 1.7 for now. I can still use SolrJ 6.1 in my app as long as I only deploy the SolrJ JAR and not build it from source. Right? If you can get your code to build while targeting 1.7, I

Re: [SolrJ Client] Error calling add: connection is still allocated

2018-09-21 Thread Shawn Heisey
On 9/21/2018 10:31 AM, Christopher Schultz wrote: For those interested, it looks like I was naïvely using BasicHttpClientConnectionManager, which is totally inappropriate in a multi-user threaded environment. I switched to PoolingHttpClientConnectionManager and that seems to be working much

Re: Solr 6.x and java 8

2018-09-20 Thread Shawn Heisey
On 9/20/2018 1:13 PM, tedsolr wrote: I realize that a java 8 runtime environment is required for Solr 6.x. Is it also necessary to compile to java 8 for any custom plugins running on the solr server? What about including SolrJ libraries in client code that is still compiling to 1.7? SolrJ

Re: SolrCoreInitializationException after restart of one solr node

2018-09-20 Thread Shawn Heisey
On 9/20/2018 9:32 AM, Schaum Mallik wrote: ‘Then use "bin/solr zk rm" to get rid of it from ZK.’ <— can you give the full command for this one if you don’t mind? Before doing this, try what I suggested.  I am not sure that you need to mess with what you have in ZK. If you have

Re: SolrCoreInitializationException after restart of one solr node

2018-09-20 Thread Shawn Heisey
On 9/20/2018 9:25 AM, Schaum Mallik wrote: ok so that’s the problem. The core.properties is in the replicas under /opt/solr/server/solr. So if I remove the file from all the replica folders and also move the directories under /opt/solr/server/solr/configsets to some backup location and restart this

Re: SolrCoreInitializationException after restart of one solr node

2018-09-20 Thread Shawn Heisey
On 9/20/2018 9:13 AM, Schaum Mallik wrote: In response to this mistake that I made of keeping the core.properties in the configuration directory when it was uploaded to zookeeper, how should I go about fixing it? A core.properties file in the config in ZK will not cause any problems.  It

Re: SolrCoreInitializationException after restart of one solr node

2018-09-20 Thread Shawn Heisey
On 9/20/2018 8:59 AM, Shawn Heisey wrote: I just completed a test where I did that exact sequence of converting a node from non-cloud with existing indexes to cloud, and ran into the exact same errors you're seeing.  I can relay exact details of the test if required. Here's what I did

Re: SolrCoreInitializationException after restart of one solr node

2018-09-20 Thread Shawn Heisey
On 9/20/2018 8:44 AM, Schaum Mallik wrote: Thank you for your detailed responses. I am still kind of confused though. Just to give you some more insight. When I first created the cloud I created two collections and the ‘-d’ option pointed to the directory where the config for the collection was

Re: SolrCoreInitializationException after restart of one solr node

2018-09-20 Thread Shawn Heisey
On 9/20/2018 8:22 AM, Schaum Mallik wrote: Thanks for the response Shawn. My follow up question is how would the zookeeper ensemble know that the location of the indexes has changed? Also do I need to apply the same changes to the other 2 solr nodes which are working fine? This move is not to

Re: SolrCoreInitializationException after restart of one solr node

2018-09-20 Thread Shawn Heisey
On 9/20/2018 6:02 AM, Schaum Mallik wrote: Yeah, my indexes, reads and writes work fine on the other two solr nodes. Since I have this setup running in prod currently, what are the steps you would advise I take to resolve this issue? Starting from scratch is really not an option since it will

Re: Jetty Sqlserver config

2018-09-20 Thread Shawn Heisey
On 9/20/2018 4:10 AM, Srinivas Kashyap wrote: I'm having a problem setting up the SQL Server data import handler for the Jetty container. Why not just define the datasource directly in the DIH config? One reason I can think of why you might not want that is that you don't want people to be able to
