RE: Solr Index issue on string type while querying

2017-05-16 Thread Matt Kuiper
Your problem statement is not quite clear; however, I will make a guess.

Assuming your problem is that when you remove the '>' sign from your query term 
you receive zero results, then this is actually expected behavior for field 
types that are of type string.  When searching against string fields you need 
to match the whole field value exactly.  So the '>' is needed to get a match.  
Recommend redefining or adding corresponding fields as type text_general.   
This type is tokenized and will allow for the match you are looking for.

Matt
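To illustrate the quoting Matt describes, here is a minimal sketch (Python standard library only; the host and core name are placeholders) of building a select URL whose q parameter phrase-quotes and URL-encodes a string-field value containing '>':

```python
from urllib.parse import urlencode

def build_string_field_query(base_url, field, value):
    """Build a Solr select URL that matches a string field exactly.

    string fields are not tokenized, so the full stored value --
    including the leading '>' -- must be phrase-quoted and sent
    verbatim for the query to match.
    """
    # Escape any embedded double quotes, wrap the value in quotes so
    # '>' and spaces survive the query parser, then URL-encode.
    q = '%s:"%s"' % (field, value.replace('"', r'\"'))
    return base_url + "/select?" + urlencode({"q": q, "wt": "json"})

url = build_string_field_query(
    "http://localhost:8983/solr/mycore",   # hypothetical core name
    "heightSquareTube_string_mv", "> 90 - 100 mm")
print(url)
```

A text_general copy of the field would instead let `heightSquareTube_string_mv:"90 - 100 mm"` match, since tokenization drops the need for an exact whole-value match.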

-Original Message-
From: Padmanabhan V [mailto:padmanabhan.venkitachalapa...@gmail.com] 
Sent: Tuesday, May 16, 2017 9:33 AM
To: solr-user@lucene.apache.org
Subject: Solr Index issue on string type while querying

Hello Solr Geeks,

I am looking for some helping hands on an issue I am facing. Given below is 
one record from the prepared index. I can query the other fields, whose values 
have no greater-than symbol, but querying widthSquareTube_string_mv and 
heightSquareTube_string_mv returns no results, even though there are records 
with values tagged like the ones below. These two fields are dynamicFields of 
field type *string*.

Given below are the queries executed through the Solr console Query area:

1. heightSquareTube_string_mv:> 90 - 100 mm

2. heightSquareTube_string_mv:"> 90 - 100 mm"


{
"indexOperationId_long": 379908,
"id": "Online/10004003x1500",
"pk": 2558081,
"wallThickessTubeSquare_string_mv": [
"3 - 5.99 mm"
],
"widthSquareTube_string_mv": [
"> 30 - 40 mm"
],
"heightSquareTube_string_mv": [
"> 90 - 100 mm"
],
"length_string_mv": [
"1000 - 1999 mm"
],
"allCategories_string_mv": [
"AL_ST",
"100",
"F000",
"F060",
"AL",
"F061"
],
"category_string_mv": [
"AL_ST",
"100",
"F000",
"F060",
"AL",
"F061"
],
"inStockFlag_boolean": true,
"baseProduct_string": "ST606010004003",
"name_text_de_de": "100 x 40 x 3 x 1500 mm",
"name_sortable_de_de_sortabletext": "100 x 40 x 3 x 1500 mm",
"autosuggest": [
"ST606010004003x1500"
],
"_version_": 1567229255468712000
}


Best Regards,
Padmanabhan.V


RE: Schemaless Mode - Multivalued

2016-12-02 Thread Matt Kuiper
Yes, the defaults make sense.  I believe I found the chain - .

Thanks!

Matt


-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] 
Sent: Thursday, December 01, 2016 7:19 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: Schemaless Mode - Multivalued

Sure, the whole chain is defined in solrconfig.xml, you can modify it any way 
you want.

The reason it is there is that when Solr sees the first field occurrence, it 
does not know whether the next occurrence will be multivalued. So it errs on 
the side of caution.
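For reference, the chain in question is an updateRequestProcessorChain in solrconfig.xml. The sketch below is illustrative only — factory names, type names, and defaults vary by Solr version, so check your own configset before editing:

```xml
<!-- Sketch of a schemaless field-guessing chain (names vary by version).
     Fields default to multiValued because the guessed field types
     (e.g. "strings") are multiValued; mapping to single-valued types
     changes the default. -->
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <processor class="solr.ParseLongFieldUpdateProcessorFactory"/>
  <processor class="solr.ParseDoubleFieldUpdateProcessorFactory"/>
  <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
    <str name="defaultFieldType">strings</str>
    <lst name="typeMapping">
      <str name="valueClass">java.lang.Long</str>
      <!-- pointing at a single-valued field type here would make
           guessed long fields single-valued -->
      <str name="fieldType">tlongs</str>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Swapping the multivalued target types (strings, tlongs, ...) for single-valued ones is the usual way to change the default Matt asks about.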

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 1 December 2016 at 15:11, Matt Kuiper <matt.kui...@issinc.com> wrote:
> Hi -
>
> I have noticed when using schemaless mode that it appears that all fields 
> created by this mode are defined as Multivalued.   Is there a way to modify a 
> configuration so that the default is to be not Multivalued?
>
> Thanks,
> Matt
>


Schemaless Mode - Multivalued

2016-12-01 Thread Matt Kuiper
Hi -

I have noticed when using schemaless mode that it appears that all fields 
created by this mode are defined as Multivalued.   Is there a way to modify a 
configuration so that the default is to be not Multivalued?

Thanks,
Matt



RE: Solr Doc Site Down?

2016-12-01 Thread Matt Kuiper
Thanks, looks like it is back up now.

Matt 

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Thursday, December 01, 2016 9:42 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr Doc Site Down?

On 12/1/2016 9:11 AM, Matt Kuiper wrote:
> FYI - This morning I am no longer able to access - 
> https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference
> +Guide

I tried it when I saw this, and it worked.  Then I popped in the Infra hipchat 
channel to ask them whether your IP (found it in your email
headers) was banned, and they indicated it wasn't, but that the site is having 
problems.  I tried the guide again, and it didn't work.

https://www.apache.org/dev/infra-contact

You can check the status of all Apache public infrastructure at the URL below, 
and it does indicate a service disruption on the Confluence Wiki.

http://status.apache.org/

Thanks,
Shawn



Solr Doc Site Down?

2016-12-01 Thread Matt Kuiper
FYI - This morning I am no longer able to access - 
https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide

Matt


RE: Reload or Reload and Solr Restart

2016-03-24 Thread Matt Kuiper
Based on what I have read, it looks like only a collection reload is needed for 
the scenario below and for that matter for applying any modifications to the 
solrconfig.xml.

Matt

From: Matt Kuiper
Sent: Wednesday, March 23, 2016 10:26 AM
To: solr-user@lucene.apache.org
Subject: Reload or Reload and Solr Restart

Hi,

I am preparing for a Production install.  In this release we will be moving 
from an AutoSuggest feature based on the Suggestor component to one based on an 
Ngram approach.  We will perform a re-index of the source data.

After updating the Solr config for each collection a collection reload (via the 
Solr collection api) will be executed. My question is whether this reload will 
clear the memory used by the Suggestor component or if a Solr restart on each 
Solr node will be necessary to clear the in-memory structure that was 
previously used by the Suggestor component.

Thanks,
Matt
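As a sketch of the reload discussed above (host and collection names are placeholders; Python standard library only), the Collections API RELOAD request can be composed like this:

```python
from urllib.parse import urlencode

def collections_api_url(host, action, **params):
    """Compose a Solr Collections API request URL.

    Per the conclusion above, a RELOAD re-reads solrconfig.xml for
    every core in the collection, so components dropped from the
    config (like the old Suggester) are torn down without a restart.
    """
    query = urlencode({"action": action, **params})
    return "http://%s/solr/admin/collections?%s" % (host, query)

# Hypothetical host and collection names:
url = collections_api_url("localhost:8983", "RELOAD", name="kla_collection")
print(url)
```

Issuing the resulting URL (e.g. with curl) against any node reloads the collection cluster-wide.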



Reload or Reload and Solr Restart

2016-03-23 Thread Matt Kuiper
Hi,

I am preparing for a Production install.  In this release we will be moving 
from an AutoSuggest feature based on the Suggestor component to one based on an 
Ngram approach.  We will perform a re-index of the source data.

After updating the Solr config for each collection a collection reload (via the 
Solr collection api) will be executed. My question is whether this reload will 
clear the memory used by the Suggestor component or if a Solr restart on each 
Solr node will be necessary to clear the in-memory structure that was 
previously used by the Suggestor component.

Thanks,
Matt



RE: Solr 4.10 Suggestor

2016-03-19 Thread Matt Kuiper
Thanks Erick!  After I posted, I did wonder if Solr would be available prior to 
the build completing.

Yes, soon looking to move to a different approach (ngrams), even though 
currently the corpus is small.

Matt

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, March 16, 2016 3:53 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: Solr 4.10 Suggestor

The log files will have messages, but nothing that I know of programmatically.

Solr won't accept any requests if it's building on startup until the build is 
done, though. And prior to 5.1, specifying buildOnStartup=false was ignored; 
see SOLR-6679. That JIRA just took the suggester out of solrconfig.xml; it 
wasn't until SOLR-6845 that the underlying issue was addressed.

IMO, this pretty much makes suggester unusable for a large corpus.

Erick

On Wed, Mar 16, 2016 at 2:35 PM, Matt Kuiper <matt.kui...@issinc.com> wrote:
> All,
>
> Using the Suggestor component and running Solr 4.10.  I have read that on 
> Solr startup (or commit, depending on config) the building of the Suggestor 
> can be CPU intensive and take some time.  Does anyone know how to determine 
> that the Suggestor has completed its build?  Something to look for in the 
> logs?
>
> Thanks,
> Matt


Solr 4.10 Suggestor

2016-03-19 Thread Matt Kuiper
All,

Using the Suggestor component and running Solr 4.10.  I have read that on Solr 
startup (or commit, depending on config) the building of the Suggestor can be 
CPU intensive and take some time.  Does anyone know how to determine that the 
Suggestor has completed its build?  Something to look for in the logs?

Thanks,
Matt


RE: Querying Dynamic Fields

2015-10-26 Thread Matt Kuiper (Springblox)
Give the following a try -

http://localhost:8983/solr/core_name/admin/luke?numTerms=0 

Matt

Matt Kuiper
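A small sketch of consuming the Luke response Matt points at (the sample payload below is hypothetical and trimmed; the real response nests more detail per field):

```python
import json

def index_field_types(luke_response):
    """List concrete field names and types from a Luke handler response.

    Unlike /schema/fields, /admin/luke reports every field that actually
    exists in the index, including fields created from dynamicField
    patterns such as *_s, *_i, *_txt.
    """
    fields = luke_response.get("fields", {})
    return sorted((name, info.get("type")) for name, info in fields.items())

# Trimmed, hypothetical Luke output:
sample = json.loads("""
{"fields": {
   "id":        {"type": "string"},
   "title_txt": {"type": "text_general"},
   "price_i":   {"type": "int"}}}
""")
for name, ftype in index_field_types(sample):
    print(name, ftype)
```

Here `price_i` and `title_txt` would be dynamic-field instances invisible to /schema/fields.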

-Original Message-
From: Patrick Hoeffel [mailto:patrick.hoef...@issinc.com] 
Sent: Monday, October 26, 2015 4:56 PM
To: solr-user@lucene.apache.org
Subject: Querying Dynamic Fields

I have a simple Solr schema that uses dynamic fields to create most of my 
fields. This works great. Unfortunately, I now need to ask Solr to give me the 
names of the fields in the schema. I'm using:

http://localhost:8983/solr/core/schema/fields

This returns the statically defined fields, but does not return the ones that 
were created matching my dynamic definitions, such as *_s, *_i, *_txt, etc.

I know Solr is aware of these fields, because I can query against them.

What is the secret sauce to query their names and data types?

Thanks,

Patrick Hoeffel
Senior Software Engineer
Intelligent Software Solutions (www.issinc.com<http://www.issinc.com/>)
(719) 452-7371 (direct)
(719) 210-3706 (mobile)

"Bringing Knowledge to Light"



Tips for Shard recovery?

2015-07-31 Thread Matt Kuiper
Hello,

Wondering if there are any tips on how to recover a shard when all nodes for 
the shard are down and ZK cannot find a leader (clusterstate.json has no 
replica marked as leader for the shard)?  Bouncing the nodes does not seem to 
help.  Seems like I need to reset the clusterstate.

Running Solr 4.10.1

From clusterstate.json:   // Problem is with shard35
"shard34":{
   "range":"28f5-2e13",
   "state":"active",
   "replicas":{
     "core_node49":{
       "state":"active",
       "core":"kla_collection_shard34_replica1",
       "node_name":"172.29.24.54:8983_solr",
       "base_url":"http://172.29.24.54:8983/solr",
       "leader":"true"},   // No such line for shard35
     "core_node71":{
       "state":"active",
       "core":"kla_collection_shard34_replica2",
       "node_name":"172.29.24.53:8983_solr",
       "base_url":"http://172.29.24.53:8983/solr"}}},
"shard35":{
   "range":"2e14-3332",
   "state":"active",
   "replicas":{
     "core_node51":{
       "state":"down",
       "core":"kla_collection_shard35_replica1",
       "node_name":"172.29.24.54:8983_solr",
       "base_url":"http://172.29.24.54:8983/solr"},
     "core_node75":{
       "state":"down",
       "core":"kla_collection_shard35_replica2",
       "node_name":"172.29.24.53:8983_solr",
       "base_url":"http://172.29.24.53:8983/solr"}}},

Related log entries:

7/31/2015, 1:25:17 PM  ERROR  ZkController  Error getting leader from zk

org.apache.solr.common.SolrException: Could not get leader props
at 
org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:950)
at 
org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:914)
at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:870)
at org.apache.solr.cloud.ZkController.register(ZkController.java:815)
at org.apache.solr.cloud.ZkController.register(ZkController.java:763)
at org.apache.solr.core.ZkContainer$2.run(ZkContainer.java:221)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /collections/kla_collection/leaders/shard39
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at 
org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:307)
at 
org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:304)
at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:74)
at 
org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:304)
at 
org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:928)
... 8 more


--


:org.apache.solr.common.SolrException: Error getting leader from zk for shard 
shard35

 at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:903)

 at org.apache.solr.cloud.ZkController.register(ZkController.java:815)

 at org.apache.solr.cloud.ZkController.register(ZkController.java:763)

 at org.apache.solr.core.ZkContainer$2.run(ZkContainer.java:221)

 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

 at java.lang.Thread.run(Thread.java:745)

Caused by: org.apache.solr.common.SolrException: Could not get leader props

 at 
org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:950)

 at 
org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:914)

 at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:870)

 ... 6 more

Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /collections/kla_collection/leaders/shard35

 at 
org.apache.zookeeper.KeeperException.create(KeeperException.java:111)

 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)

 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)

 at 
org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:307)

 at 
org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:304)

 at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:74)

 at 
org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:304)

 at 
org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:928)

 ... 8 more


Thanks,
Matt
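The no-leader condition above can be spotted programmatically by scanning clusterstate.json for shards where no replica carries the leader flag. A sketch against a trimmed, hypothetical snippet of the state shown above:

```python
import json

def leaderless_shards(clusterstate, collection):
    """Return shard names that have no replica marked "leader":"true".

    Mirrors the failure above: shard35's replicas are all down and none
    carries the leader flag, so leader lookup throws NoNodeException.
    """
    shards = clusterstate[collection]["shards"]
    return sorted(
        shard for shard, data in shards.items()
        if not any(r.get("leader") == "true"
                   for r in data["replicas"].values()))

# Trimmed, hypothetical clusterstate.json content:
sample = json.loads("""
{"kla_collection": {"shards": {
  "shard34": {"replicas": {
     "core_node49": {"state": "active", "leader": "true"},
     "core_node71": {"state": "active"}}},
  "shard35": {"replicas": {
     "core_node51": {"state": "down"},
     "core_node75": {"state": "down"}}}}}}
""")
print(leaderless_shards(sample, "kla_collection"))
```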


RE: Clusterstate - state active

2015-04-09 Thread Matt Kuiper
Erick,

I do not give it an explicit name.  I use a call like:

 curl "172.29.24.47:8983/solr/admin/collections?action=ADDREPLICA&collection=kla_collection&shard=shard25&node=172.29.24.75:8983_solr"

It does not appear to be reusing the name, if by name you mean core_node* or 
core.  Both are different below for the replicas marked as leader true.  Note 
the second section shows recovery_failed for the leader.

"shard25":{
  "range":"fae1-",
  "state":"active",
  "replicas":{
    "core_node48":{
      "state":"active",
      "core":"kla_collection_shard25_replica1",
      "node_name":"172.29.24.48:8983_solr",
      "base_url":"http://172.29.24.48:8983/solr",
      "leader":"true"},
    "core_node59":{
      "state":"active",
      "core":"kla_collection_shard25_replica2",
      "node_name":"172.29.24.47:8983_solr",
      "base_url":"http://172.29.24.47:8983/solr"}}},


"shard25":{
  "range":"fae1-",
  "state":"active",
  "replicas":{
    "core_node149":{
      "state":"recovery_failed",
      "core":"kla_collection_shard25_replica3",
      "node_name":"172.29.24.75:8983_solr",
      "base_url":"http://172.29.24.75:8983/solr",
      "leader":"true"},
    "core_node150":{
      "state":"recovering",
      "core":"kla_collection_shard25_replica1",
      "node_name":"172.29.24.76:8983_solr",
      "base_url":"http://172.29.24.76:8983/solr"}}},

Thanks,
Matt

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, April 08, 2015 6:20 PM
To: solr-user@lucene.apache.org
Subject: Re: Clusterstate - state active

Matt:

How are you creating the new replica? Are you giving it an explicit name? And 
especially is it the same name as one you've already deleted?

'cause I can't really imagine why you'd be getting a ZK exception saying the 
node already exists.

Shot in the dark here..

On Wed, Apr 8, 2015 at 4:11 PM, Matt Kuiper <matt.kui...@issinc.com> wrote:
 Found this error, which likely explains my issue with new replicas not coming 
 up; not sure of the next step.  Almost looks like Zookeeper's record of a 
 shard's leader is not being updated?

 4/8/2015, 4:56:03 PM  ERROR  ShardLeaderElectionContext
 There was a problem trying to register as the 
 leader:org.apache.solr.common.SolrException: Could not register as the leader 
 because creating the ephemeral registration node in ZooKeeper failed
 at 
 org.apache.solr.cloud.ShardLeaderElectionContextBase.runLeaderProcess(ElectionContext.java:150)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:306)
 at 
 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
 at 
 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
 at 
 org.apache.solr.cloud.LeaderElector.access$200(LeaderElector.java:55)
 at 
 org.apache.solr.cloud.LeaderElector$ElectionWatcher.process(LeaderElector.java:358)
 at 
 org.apache.solr.common.cloud.SolrZkClient$3$1.run(SolrZkClient.java:209)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.solr.common.SolrException: 
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /collections/kla_collection/leaders/shard4
 at 
 org.apache.solr.common.util.RetryUtil.retryOnThrowable(RetryUtil.java:40)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContextBase.runLeaderProcess(ElectionContext.java:137)
 ... 11 more
 Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: 
 KeeperErrorCode = NodeExists for /collections/kla_collection/leaders/shard4
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
 at 
 org.apache.solr.common.cloud.SolrZkClient$11.execute(SolrZkClient.java:462)
 at 
 org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:74)
 at 
 org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:459

RE: Clusterstate - state active

2015-04-08 Thread Matt Kuiper
Found this error, which likely explains my issue with new replicas not coming 
up; not sure of the next step.  Almost looks like Zookeeper's record of a 
shard's leader is not being updated?

4/8/2015, 4:56:03 PM  ERROR  ShardLeaderElectionContext
There was a problem trying to register as the 
leader:org.apache.solr.common.SolrException: Could not register as the leader 
because creating the ephemeral registration node in ZooKeeper failed
at 
org.apache.solr.cloud.ShardLeaderElectionContextBase.runLeaderProcess(ElectionContext.java:150)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:306)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
at org.apache.solr.cloud.LeaderElector.access$200(LeaderElector.java:55)
at 
org.apache.solr.cloud.LeaderElector$ElectionWatcher.process(LeaderElector.java:358)
at 
org.apache.solr.common.cloud.SolrZkClient$3$1.run(SolrZkClient.java:209)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: 
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
NodeExists for /collections/kla_collection/leaders/shard4
at 
org.apache.solr.common.util.RetryUtil.retryOnThrowable(RetryUtil.java:40)
at 
org.apache.solr.cloud.ShardLeaderElectionContextBase.runLeaderProcess(ElectionContext.java:137)
... 11 more
Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: 
KeeperErrorCode = NodeExists for /collections/kla_collection/leaders/shard4
at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at 
org.apache.solr.common.cloud.SolrZkClient$11.execute(SolrZkClient.java:462)
at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:74)
at 
org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:459)
at 
org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:416)
at 
org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:403)
at 
org.apache.solr.cloud.ShardLeaderElectionContextBase$1.execute(ElectionContext.java:142)
at 
org.apache.solr.common.util.RetryUtil.retryOnThrowable(RetryUtil.java:34)

Matt


-Original Message-
From: Matt Kuiper [mailto:matt.kui...@issinc.com] 
Sent: Wednesday, April 08, 2015 4:36 PM
To: solr-user@lucene.apache.org
Subject: RE: Clusterstate - state active

Erick, Anshum,

Thanks for your replies!  Yes, it is replica state that I am looking at, and 
this is the answer I was hoping for.

I am working on a solution that involves moving some replicas to new Solr nodes 
as they are made available.  Before deleting the original replicas backing the 
shard, I check the replica state to make sure is active for the new replicas.  

Initially it was working pretty well, but with more recent testing I regularly 
see the shard go down.  The two new replicas go into failed recovery state 
after the original replicas are deleted, the logs report that a registered 
leader was not found for the shard.  Initially I was concerned that maybe the 
new shards were not fully synced with the leader, even though I checked for 
active state.

Now I am wondering if the new replicas are somehow competing (or somehow 
reluctant) to become leader, and thus neither becomes leader.  I plan to test 
creating just one new replica on a new Solr node, checking that its state is 
active, then deleting the original replicas, and then creating the second new 
replica.

Any thoughts?

Matt

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, April 08, 2015 4:13 PM
To: solr-user@lucene.apache.org
Subject: Re: Clusterstate - state active

Matt:

In a word, yes. Depending on the size of the index for that shard, the 
transition from Down->Recovering->Active may be too fast to catch.
If replicating the index takes a while, though, you should at least see the 
Recovering state, during which time there won't be any searches forwarded to 
that node.

Best,
Erick

On Wed, Apr 8, 2015 at 2:58 PM, Matt Kuiper <matt.kui...@issinc.com> wrote:
 Hello,

 When creating a new

Clusterstate - state active

2015-04-08 Thread Matt Kuiper
Hello,

When creating a new replica, and the state is recorded as active with in ZK 
clusterstate, does that mean that new replica has synched with the leader 
replica for the particular shard?

Thanks,
Matt
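A sketch of reading per-replica state out of a clusterstate snapshot, which is what the "active means synced" question hinges on (the sample data is abbreviated and hypothetical):

```python
def replica_states(clusterstate, collection, shard):
    """Map core name -> state for one shard's replicas.

    Per the answer below, 'active' on a new replica is only reported
    after it has finished recovery, i.e. after it has synced its index
    from the shard leader; 'recovering' means the sync is in progress.
    """
    replicas = clusterstate[collection]["shards"][shard]["replicas"]
    return {r["core"]: r["state"] for r in replicas.values()}

# Abbreviated, hypothetical clusterstate:
sample = {"kla_collection": {"shards": {"shard25": {"replicas": {
    "core_node48":  {"core": "kla_collection_shard25_replica1",
                     "state": "active", "leader": "true"},
    "core_node149": {"core": "kla_collection_shard25_replica3",
                     "state": "recovering"}}}}}}
states = replica_states(sample, "kla_collection", "shard25")
print(states)
```

Polling this until the new core reports "active" is the check discussed in the rest of the thread.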



RE: Clusterstate - state active

2015-04-08 Thread Matt Kuiper
Erick, Anshum,

Thanks for your replies!  Yes, it is replica state that I am looking at, and 
this is the answer I was hoping for.

I am working on a solution that involves moving some replicas to new Solr nodes 
as they are made available.  Before deleting the original replicas backing the 
shard, I check the replica state to make sure is active for the new replicas.  

Initially it was working pretty well, but with more recent testing I regularly 
see the shard go down.  The two new replicas go into failed recovery state 
after the original replicas are deleted, the logs report that a registered 
leader was not found for the shard.  Initially I was concerned that maybe the 
new shards were not fully synced with the leader, even though I checked for 
active state.

Now I am wondering if the new replicas are somehow competing (or somehow 
reluctant) to become leader, and thus neither becomes leader.  I plan to test 
creating just one new replica on a new Solr node, checking that its state is 
active, then deleting the original replicas, and then creating the second new 
replica.

Any thoughts?

Matt

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, April 08, 2015 4:13 PM
To: solr-user@lucene.apache.org
Subject: Re: Clusterstate - state active

Matt:

In a word, yes. Depending on the size of the index for that shard, the 
transition from Down->Recovering->Active may be too fast to catch.
If replicating the index takes a while, though, you should at least see the 
Recovering state, during which time there won't be any searches forwarded to 
that node.

Best,
Erick

On Wed, Apr 8, 2015 at 2:58 PM, Matt Kuiper <matt.kui...@issinc.com> wrote:
 Hello,

 When creating a new replica, and the state is recorded as active with in ZK 
 clusterstate, does that mean that new replica has synched with the leader 
 replica for the particular shard?

 Thanks,
 Matt



RE: How to recover a Shard

2015-04-02 Thread Matt Kuiper
Thanks Erick!  Understand your warning.  Next time it occurs, I will plan to 
give it a try.  I am currently in a dev environment, so it is a safe place to 
experiment.

Matt

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Thursday, April 02, 2015 9:40 AM
To: solr-user@lucene.apache.org
Subject: Re: How to recover a Shard

Matt:

This seems dangerous, but you might be able to use the Collections API to:
1> DELETEREPLICA all but one.
2> RELOAD the collection.
3> ADDREPLICA back.

I don't _like_ this much mind you as when you added the replicas back it'd 
replicate the index from the leader, but at least you might not have to take 
Solr down.

I'm not completely sure that this'll work, mind you but

Erick

On Wed, Apr 1, 2015 at 8:04 PM, Matt Kuiper <matt.kui...@issinc.com> wrote:
 Maybe I have been working too many long hours as I missed the obvious 
 solution of bringing down/up one of the Solr nodes backing one of the 
 replicas, and then the same for the second node.  This did the trick.

 Since I brought this topic up, I will narrow the question a bit:  Would there 
 be a way to recover without restarting the Solr node?  Basically to delete 
 one replica and then somehow declare the other replica the leader and break 
 it out of its recovery process?

 Thanks,
 Matt


 From: Matt Kuiper
 Sent: Wednesday, April 01, 2015 8:43 PM
 To: solr-user@lucene.apache.org
 Subject: How to recover a Shard

 Hello,

 I have a SolrCloud (4.10.1) where for one of the shards, both replicas are in 
 a Recovery Failed state per the Solr Admin Cloud page.  The logs contain 
 the following type of entries for the two Solr nodes involved, including 
 statements that it will retry.

 Is there a way to recover from this state?

 Maybe bring down one replica, and then somehow declare that the remaining 
 replica is to be the leader?  Understand this would not be ideal as the new 
 leader may be missing documents that were sent its way to be indexed while it 
 was down, but would be better than having to rebuild the whole cloud.

 Any tips or suggestions would be appreciated.

 Thanks,
 Matt

 Solr node .65
 Error while trying to recover. 
 core=kla_collection_shard6_replica5:org.apache.solr.common.SolrException: No 
 registered leader was found after waiting for 4000ms , collection: 
 kla_collection slice: shard6
  at 
 org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
  at 
 org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
  at 
 org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
  at 
 org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
 Solr node .64

 Error while trying to recover. 
 core=kla_collection_shard6_replica2:org.apache.solr.common.SolrExcepti
 on: No registered leader was found after waiting for 4000ms , 
 collection: kla_collection slice: shard6

  at 
 org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReade
 r.java:568)

  at 
 org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReade
 r.java:551)

  at 
 org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.jav
 a:332)

  at 
 org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)



How to recover a Shard

2015-04-01 Thread Matt Kuiper
Hello,

I have a SolrCloud (4.10.1) where for one of the shards, both replicas are in a 
Recovery Failed state per the Solr Admin Cloud page.  The logs contain the 
following type of entries for the two Solr nodes involved, including statements 
that it will retry.

Is there a way to recover from this state?

Maybe bring down one replica, and then somehow declare that the remaining 
replica is to be the leader?  Understand this would not be ideal as the new 
leader may be missing documents that were sent its way to be indexed while it 
was down, but would be better than having to rebuild the whole cloud.

Any tips or suggestions would be appreciated.

Thanks,
Matt

Solr node .65
Error while trying to recover. 
core=kla_collection_shard6_replica5:org.apache.solr.common.SolrException: No 
registered leader was found after waiting for 4000ms , collection: 
kla_collection slice: shard6
 at 
org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
 at 
org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
 at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
 at 
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
Solr node .64

Error while trying to recover. 
core=kla_collection_shard6_replica2:org.apache.solr.common.SolrException: No 
registered leader was found after waiting for 4000ms , collection: 
kla_collection slice: shard6

 at 
org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)

 at 
org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)

 at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)

 at 
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)



RE: How to recover a Shard

2015-04-01 Thread Matt Kuiper
Maybe I have been working too many long hours as I missed the obvious solution 
of bringing down/up one of the Solr nodes backing one of the replicas, and then 
the same for the second node.  This did the trick.

Since I brought this topic up, I will narrow the question a bit:  Would there 
be a way to recover without restarting the Solr node?  Basically to delete one 
replica and then somehow declare the other replica the leader and break it out 
of its recovery process?

Thanks,
Matt


From: Matt Kuiper
Sent: Wednesday, April 01, 2015 8:43 PM
To: solr-user@lucene.apache.org
Subject: How to recover a Shard

Hello,

I have a SolrCloud (4.10.1) where for one of the shards, both replicas are in a 
Recovery Failed state per the Solr Admin Cloud page.  The logs contain the 
following type of entries for the two Solr nodes involved, including statements 
that it will retry.

Is there a way to recover from this state?

Maybe bring down one replica, and then somehow declare that the remaining 
replica is to be the leader?  Understand this would not be ideal as the new 
leader may be missing documents that were sent its way to be indexed while it 
was down, but would be better than having to rebuild the whole cloud.

Any tips or suggestions would be appreciated.

Thanks,
Matt

Solr node .65
Error while trying to recover. 
core=kla_collection_shard6_replica5:org.apache.solr.common.SolrException: No 
registered leader was found after waiting for 4000ms , collection: 
kla_collection slice: shard6
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
 at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
 at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
Solr node .64

Error while trying to recover. 
core=kla_collection_shard6_replica2:org.apache.solr.common.SolrException: No 
registered leader was found after waiting for 4000ms , collection: 
kla_collection slice: shard6

 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
 at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
 at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)



RE: Solr Monitoring - Stored Stats?

2015-03-26 Thread Matt Kuiper
Erick, Shawn,

Thanks for your responses.  I figured this was the case, just wanted to check 
to be sure.

I have used Zabbix to configure JMX points to monitor over time, but it was a 
bit of work to get configured.  We are looking to create a simple dashboard of 
a few stats over time.  Looks like the easiest approach will be to make an app 
that calls for these stats at a regular interval and then indexes the results to 
Solr; then we will be able to query over desired time frames...
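A minimal sketch of the indexing side of such a poller — the stat names and dynamic-field suffixes here are illustrative, and the values would come from scraping each core's /admin/mbeans?stats=true&wt=json endpoint:

```python
from datetime import datetime, timezone

def stats_doc(core, stats, when=None):
    """Flatten one polling interval's worth of mbean stats into a Solr document."""
    when = when or datetime.now(timezone.utc)
    doc = {"id": f"{core}-{when.isoformat()}",
           "core_s": core,
           "timestamp_dt": when.isoformat()}
    for name, value in stats.items():
        # a *_f dynamic field avoids touching the schema of the stats collection
        doc[f"{name}_f"] = float(value)
    return doc

# e.g. values pulled from the mbeans response for a query handler:
doc = stats_doc("collection1", {"avgTimePerRequest": 12.5, "requests": 1042})
```

Index one such document per interval, then range-query on the timestamp field to chart any stat over a time window.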

Thanks,
Matt

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, March 25, 2015 10:30 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr Monitoring - Stored Stats?

Matt:

Not really. There's a bunch of third-party log analysis tools that give much of 
this information (not everything exposed by JMX of course is in the log files 
though).

Not quite sure whether things like Nagios, Zabbix and the like have this kind 
of stuff built in... seems like a natural extension of those kinds of tools 
though...

Not much help here...
Erick

On Wed, Mar 25, 2015 at 8:26 AM, Matt Kuiper matt.kui...@issinc.com wrote:
 Hello,

 I am familiar with the JMX points that Solr exposes to allow for monitoring 
 of statistics like QPS, numdocs, Average Query Time...

 I am wondering if there is a way to configure Solr to automatically store the 
 value of these stats over time (for a given time interval), and then allow a 
 user to query a stat over a time range.  So for the QPS stat,  the query 
 might return a set that includes the QPS value for each hour in the time 
 range specified.

 Thanks,
 Matt




Solr Monitoring - Stored Stats?

2015-03-25 Thread Matt Kuiper
Hello,

I am familiar with the JMX points that Solr exposes to allow for monitoring of 
statistics like QPS, numdocs, Average Query Time...

I am wondering if there is a way to configure Solr to automatically store the 
value of these stats over time (for a given time interval), and then allow a 
user to query a stat over a time range.  So for the QPS stat,  the query might 
return a set that includes the QPS value for each hour in the time range 
specified.

Thanks,
Matt




RE: How to make SolrCloud more elastic

2015-02-12 Thread Matt Kuiper
Toke,

Thanks for your reply.  Yes, I believe I will be working with a write once 
archive.  However, my understanding is that all shards are defined up front, 
with the option to split later.

Can you describe, or point me to documentation, on how to create shards one at 
a time?  

Thanks,
Matt

-Original Message-
From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk] 
Sent: Wednesday, February 11, 2015 11:47 PM
To: solr-user@lucene.apache.org
Subject: Re: How to make SolrCloud more elastic

On Wed, 2015-02-11 at 21:32 +0100, Matt Kuiper wrote:
 I am starting a new project and one of the requirements is that Solr 
 must scale to handle increasing load (both search performance and 
 index size).

[...]

 Before I got too deep, I wondered if anyone has any tips or warnings 
 on these approaches, or has scaled Solr in a different manner.

If your corpus only contains static content (e.g. log files or a write-once 
archive), you can create shards one at a time and optimize them. This lowers 
requirements for your searchers.
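One concrete way to get one-shard-at-a-time behavior is the implicit router, which is the only router that allows shards to be added later with CREATESHARD. A hedged sketch of the API calls involved (the collection name and monthly shard-naming scheme are made up for illustration; check the parameters against your Solr version):

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/admin/collections"

def api(action, **params):
    """Build a Collections API URL for the given action."""
    return f"{BASE}?{urlencode(dict(action=action, **params))}"

# Create the collection with the implicit router and a first shard...
create_url = api("CREATE", name="archive",
                 **{"router.name": "implicit", "shards": "shard_2015_01"})
# ...then add shards one at a time as the archive grows.
grow_url = api("CREATESHARD", collection="archive", shard="shard_2015_02")
```

Note that with the implicit router you, not a hash of the document id, decide which shard each document lands on (e.g. via the _route_ parameter at index time).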

- Toke Eskildsen, State and University Library, Denmark




RE: How to make SolrCloud more elastic

2015-02-12 Thread Matt Kuiper
Thanks Alex. Per your recommendation I checked out the presentation and it was 
very informative.

While my problem space will not reach the scale addressed in this talk, some of 
the topics may be helpful.  Those being the improvements to shard splitting and 
the new 'migrate' API.

Thanks,
Matt

Matt Kuiper - Software Engineer
Intelligent Software Solutions
p. 719.452.7721 | matt.kui...@issinc.com 
www.issinc.com | LinkedIn: intelligent-software-solutions

-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] 
Sent: Wednesday, February 11, 2015 2:31 PM
To: solr-user
Subject: Re: How to make SolrCloud more elastic

Did you have a look at the presentations from the recent SolrRevolution? E.g.
https://www.youtube.com/watch?v=nxRROble76A&list=PLU6n9Voqu_1FM8nmVwiWWDRtsEjlPqhgP

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 11 February 2015 at 15:32, Matt Kuiper matt.kui...@issinc.com wrote:
 I am starting a new project and one of the requirements is that Solr must 
 scale to handle increasing load (both search performance and index size).

 My understanding is that one way to address search performance is by adding 
 more replicas.

 I am more concerned about handling a growing index size.  I have already been 
 given some good input on this topic and am considering a shard splitting 
 approach, but am more focused on a rebalancing approach that includes 
 defining many shards up front and then moving these existing shards on to new 
 Solr servers as needed.  Plan to experiment with this approach first.

 Before I got too deep, I wondered if anyone has any tips or warnings on these 
 approaches, or has scaled Solr in a different manner.

 Thanks,
 Matt


RE: How to make SolrCloud more elastic

2015-02-12 Thread Matt Kuiper
Otis,

Thanks for your reply.  I see your point about too many shards and search 
efficiency.  I also agree that I need to get a better handle on customer 
requirements and expected loads.  

Initially I figured that with the shard splitting option, I would need to 
double my Solr nodes every time I split (as I would want to split every shard 
within the collection).  Whereas actually only the number of shards would double, 
and then I would have the opportunity to rebalance the shards over the existing 
Solr nodes plus a number of new nodes that make sense at the time.  This may be 
preferable to defining many micro shards up front.

The time-base collections may be an option for this project.  I am not familiar 
with query routing, can you point me to any documentation on how this might be 
implemented?

Thanks,
Matt

-Original Message-
From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] 
Sent: Wednesday, February 11, 2015 9:13 PM
To: solr-user@lucene.apache.org
Subject: Re: How to make SolrCloud more elastic

Hi Matt,

You could create extra shards up front, but if your queries are fanned out to 
all of them, you can run into situations where there are too many concurrent 
queries per node, causing lots of context switching and ultimately being less 
efficient than if you had fewer shards.  So while this is an approach to take, 
I'd personally first try to run tests to see how much a single node can handle 
in terms of volume, expected query rates, and target latency, and then use 
monitoring/alerting/whatever-helps tools to keep an eye on the cluster so that 
when you start approaching the target limits you are ready with additional 
nodes and shard splitting if needed.

Of course, if your data and queries are such that newer documents are queried 
more, you should look into time-based collections... and if your queries can 
only query a subset of data you should look into query routing.
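For the time-based-collections idea, the client-side routing usually amounts to computing which collections a time-bounded query must touch; a small sketch, assuming a hypothetical monthly naming scheme:

```python
from datetime import date

def collections_for_range(start: date, end: date, prefix="events"):
    """Pick the monthly collections a time-bounded query must search."""
    names, y, m = [], start.year, start.month
    while (y, m) <= (end.year, end.month):
        names.append(f"{prefix}_{y}_{m:02d}")
        m += 1
        if m == 13:
            y, m = y + 1, 1
    return names

# A query over Feb 10 - Apr 2 only needs three of the monthly collections:
cols = collections_for_range(date(2015, 2, 10), date(2015, 4, 2))
```

The resulting list can then be handed to Solr in a single request via the collection parameter, so only the relevant shards do any work.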

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & 
Elasticsearch Support * http://sematext.com/


On Wed, Feb 11, 2015 at 3:32 PM, Matt Kuiper matt.kui...@issinc.com wrote:

 I am starting a new project and one of the requirements is that Solr 
 must scale to handle increasing load (both search performance and index size).

 My understanding is that one way to address search performance is by 
 adding more replicas.

 I am more concerned about handling a growing index size.  I have 
 already been given some good input on this topic and am considering a 
 shard splitting approach, but am more focused on a rebalancing 
 approach that includes defining many shards up front and then moving 
 these existing shards on to new Solr servers as needed.  Plan to 
 experiment with this approach first.

 Before I got too deep, I wondered if anyone has any tips or warnings 
 on these approaches, or has scaled Solr in a different manner.

 Thanks,
 Matt



How to make SolrCloud more elastic

2015-02-11 Thread Matt Kuiper
I am starting a new project and one of the requirements is that Solr must scale 
to handle increasing load (both search performance and index size).

My understanding is that one way to address search performance is by adding 
more replicas.

I am more concerned about handling a growing index size.  I have already been 
given some good input on this topic and am considering a shard splitting 
approach, but am more focused on a rebalancing approach that includes defining 
many shards up front and then moving these existing shards on to new Solr 
servers as needed.  Plan to experiment with this approach first.

Before I got too deep, I wondered if anyone has any tips or warnings on these 
approaches, or has scaled Solr in a different manner.

Thanks,
Matt


RE: 1 Solr many Shards?

2015-02-10 Thread Matt Kuiper
Thanks Anshum!  Very helpful.

Matt Kuiper - Software Engineer
Intelligent Software Solutions
p. 719.452.7721 | matt.kui...@issinc.com 
www.issinc.com | LinkedIn: intelligent-software-solutions

-Original Message-
From: Anshum Gupta [mailto:ans...@anshumgupta.net] 
Sent: Monday, February 09, 2015 4:52 PM
To: solr-user@lucene.apache.org
Subject: Re: 1 Solr many Shards?

Check out the maxShardsPerNode param for CREATE collection here:
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1

It defaults to 1, i.e. on a single node only a single shard of a given collection 
is allowed, but you can override it with a really high value: e.g.
start SolrCloud with a single node, then create the collection with numShards=5 and
maxShardsPerNode=5 (or more). This will allow you to place multiple shards of 
the same collection on a single node.
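Spelled out as a request, the example above would look roughly like this (the host and collection name are placeholders):

```python
from urllib.parse import urlencode

# Five shards of one collection co-located on a single node:
params = urlencode({"action": "CREATE", "name": "mycollection",
                    "numShards": 5, "maxShardsPerNode": 5})
create_url = f"http://localhost:8983/solr/admin/collections?{params}"
```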

Also, the -DzkHost parameter has nothing to do with this. It is the connection 
string for the ZooKeeper server/ensemble and is required to start a SolrCloud 
node but has no impact on what you've asked.


On Mon, Feb 9, 2015 at 2:56 PM, Matt Kuiper matt.kui...@issinc.com wrote:

 My understanding is that a single Solr instance can manage multiple 
 cores/indexes.  I am wondering if a single Solr instance can manage 
 multiple shards (but not necessarily  all) of an index.

 If so, how might this be configured and the Solr instance started?  I 
 am familiar with starting a Solr server within a Solr Cloud that 
 handles a single shard of an index by specifying the -DzkHost parameter.

 Thanks,

 Matt Kuiper






--
Anshum Gupta
http://about.me/anshumgupta


Solr on Tomcat

2015-02-10 Thread Matt Kuiper
I am starting to look in to Solr 5.0.  I have been running Solr 4.* on Tomcat.  
 I was surprised to find the following notice on 
https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+Tomcat   
(Marked as Unreleased)

"Beginning with Solr 5.0, support for deploying Solr as a WAR in servlet 
containers like Tomcat is no longer supported."

I want to verify that it is true that Solr 5.0 will not be able to run on 
Tomcat, and confirm that the recommended way to deploy Solr 5.0 is as a Linux 
service.

Thanks,
Matt


RE: Solr on Tomcat

2015-02-10 Thread Matt Kuiper
Thanks for all the responses.  I am planning a new project, and considering 
deployment options at this time.  It's helpful to see where Solr is headed.

Thanks,

Matt Kuiper 

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Tuesday, February 10, 2015 10:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr on Tomcat

On 2/10/2015 9:48 AM, Matt Kuiper wrote:
 I am starting to look in to Solr 5.0.  I have been running Solr 4.* on 
 Tomcat.   I was surprised to find the following notice on 
 https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+Tomcat   
 (Marked as Unreleased)

  Beginning with Solr 5.0, Support for deploying Solr as a WAR in servlet 
 containers like Tomcat is no longer supported.

 I want to verify that it is true that Solr 5.0 will not be able to run on 
 Tomcat, and confirm that the recommended way to deploy Solr 5.0 is as a Linux 
 service.

Solr will eventually (hopefully soon) be entirely its own application. 
The documentation you have seen in the reference guide is there to prepare 
users for this eventuality.

Right now we are in a transition period.  We have built scripts for controlling 
the start and stop of the example server installation. 
Under the covers, Solr is still a web application contained in a war and the 
example server still runs an unmodified copy of jetty.  Down the road, when 
Solr becomes a completely standalone application, we will merely have to 
modify the script wrapper to use it, and the user may not even notice the 
change.

With 5.0, if you want to run in tomcat, you will be able to find the war in the 
download's server/webapps directory and use it just like you do now ... but we 
will be encouraging people to NOT do this, because eventually it will be 
completely unsupported.

Thanks,
Shawn



1 Solr many Shards?

2015-02-09 Thread Matt Kuiper
My understanding is that a single Solr instance can manage multiple 
cores/indexes.  I am wondering if a single Solr instance can manage multiple 
shards (but not necessarily  all) of an index.

If so, how might this be configured and the Solr instance started?  I am 
familiar with starting a Solr server within a Solr Cloud that handles a single 
shard of an index by specifying the -DzkHost parameter.

Thanks,

Matt Kuiper





RE: Updates to index not available immediately as index scales, even with autoSoftCommit at 1 second

2014-08-12 Thread Matt Kuiper (Springblox)
Based on your solrconfig.xml settings for the filter and queryResult caches, I 
believe Chris's initial guess is correct.  After a commit, there is likely 
plenty of time spent warming these caches due to the significantly high 
autowarm counts.

<filterCache class="solr.FastLRUCache"
             size="16384"
             initialSize="4096"
             autowarmCount="4096"/>

<queryResultCache class="solr.FastLRUCache"
                  size="8192"
                  initialSize="8192"
                  autowarmCount="2048"/>

Suggest you try setting autowarmCount very low or to zero, and then testing 
to confirm the problem.

You might want to monitor if any JVM garbage collections are occurring during 
this time, and causing system pauses.  With such large caches (nominally stored 
in Old Gen) you may be setting yourself up for GCs that take a significant 
amount of time and thus add to your delay.

Matt


-Original Message-
From: cwhit [mailto:cwhi...@solinkcorp.com] 
Sent: Tuesday, August 12, 2014 11:18 AM
To: solr-user@lucene.apache.org
Subject: Re: Updates to index not available immediately as index scales, even 
with autoSoftCommit at 1 second

Immediately after triggering the update, this is what is in the logs:

2014-08-12 12:58:48,774 | [71] | 153414367 [qtp2038499066-4772] INFO 
org.apache.solr.update.processor.LogUpdateProcessor  – [collection1] 
webapp=/solr path=/update params={wt=json} {add=[52627624 
(1476251068652322816)]} 0 34

2014-08-12 12:58:49,773 | [71] | 153415369 [commitScheduler-7-thread-1] INFO 
org.apache.solr.update.UpdateHandler  – start 
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}

2014-08-12 12:58:49,862 | [71] | 153415459 [commitScheduler-7-thread-1] INFO 
org.apache.solr.search.SolrIndexSearcher  – Opening Searcher@65c48c06 main

2014-08-12 12:58:49,874 | [71] | 153415472 [commitScheduler-7-thread-1] INFO 
org.apache.solr.update.UpdateHandler  – end_commit_flush

The end_commit_flush leads me to believe that the soft commit has completed, 
but perhaps that thought is wrong.  There are no other logs for a while, until 


2014-08-12 13:03:49,556 | [71] | 153715147 [commitScheduler-6-thread-1] INFO 
org.apache.solr.update.UpdateHandler  – start 
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

2014-08-12 13:03:49,805 | [71] | 153715402 [commitScheduler-6-thread-1] INFO 
org.apache.solr.core.SolrCore  – SolrDeletionPolicy.onCommit: commits: num=2

2014-08-12 13:03:49,805 | [71] |
commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@E:\Program
Files (x86)\SolrLive\SolrFiles\Solr\service\solr\data\index
lockFactory=org.apache.lucene.store.NativeFSLockFactory@1fac1a3c;
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_2we,generation=3758}

2014-08-12 13:03:49,805 | [71] |
commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@E:\Program
Files (x86)\SolrLive\SolrFiles\Solr\service\solr\data\index
lockFactory=org.apache.lucene.store.NativeFSLockFactory@1fac1a3c;
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_2wf,generation=3759}

2014-08-12 13:03:49,805 | [34] | 153715403 [commitScheduler-6-thread-1] INFO 
org.apache.solr.core.SolrCore  – newest commit generation = 3759

2014-08-12 13:03:49,818 | [34] | 153715415 [commitScheduler-6-thread-1] INFO 
org.apache.solr.update.UpdateHandler  – end_commit_flush

At this point, the update is still not present...


2014-08-12 13:11:45,279 | [81] | 154190876 [searcherExecutor-4-thread-1]
INFO  org.apache.solr.core.SolrCore  – QuerySenderListener sending requests
to Searcher@65c48c06 main{StandardDirectoryReader(segments_2we:82217:nrt
_qkc(4.6):C8161558/879724:delGen=275 _sra(4.6):C2943436/247953:delGen=51
_r2w(4.6):C1149753/18376:delGen=55 _rgs(4.6):C1468449/648612:delGen=107
_tdl(4.6):C583431/7873:delGen=94 _svo(4.6):C197286/7:delGen=5
_t4d(4.6):C247031/2928:delGen=36 _tkf(4.6):C111429/761:delGen=23
_tch(4.6):C6014/81:delGen=22 _tk5(4.6):C3907/242:delGen=21
_tjv(4.6):C3492/119:delGen=13 _thd(4.6):C5014/241:delGen=24
_tdh(4.6):C5375/437:delGen=30 _tj1(4.6):C5989/15:delGen=6
_tkq(4.6):C1749/36:delGen=6 _tmj(4.6):C961/1:delGen=1
_tlm(4.6):C714/9:delGen=5 _tm6(4.6):C2616 _tlx(4.6):C1105/273:delGen=3
_tly(4.6):C5/2:delGen=1 _tm2(4.6):C1 _tm4(4.6):C1 _tmb(4.6):C1 _tmk(4.6):C5
_tml(4.6):C12 _tmm(4.6):C1 _tmn(4.6):C2/1:delGen=1 _tmo(4.6):C1 _tmp(4.6):C1
_tmr(4.6):C1 _tms(4.6):C1)}
2014-08-12 13:11:45,280 | [81] | 154190877 [searcherExecutor-4-thread-1]
INFO  org.apache.solr.core.SolrCore  – QuerySenderListener done.

2014-08-12 13:11:45,280 | [81] | 154190877 [searcherExecutor-4-thread-1]
INFO  org.apache.solr.handler.component.SpellCheckComponent  – Building
spell index for spell checker: suggest

2014-08-12 13:11:45,280 | [81] | 154190877 [searcherExecutor-4-thread-1]
INFO  org.apache.solr.spelling.suggest.Suggester  – build()

Still no 

RE: java.lang.OutOfMemoryError: Requested array size exceeds VM limit

2014-07-25 Thread Matt Kuiper (Springblox)
You might consider looking at your internal Solr cache configuration 
(solrconfig.xml).  These caches occupy heap space, and from my understanding do 
not overflow to disk.  So if there is not enough heap memory to support the 
caches an OOM error will be thrown.

I also believe these caches live in Old Gen.  So you might consider decreasing 
your CMSInitiatingOccupancyFraction to trigger a GC sooner.

Based on your description below every 20,000 documents your caches will be 
invalidated and rebuilt as part of a commit.  So a GC that occurs sooner may 
help free the memory of the old caches.  

Matt

-Original Message-
From: Ameya Aware [mailto:ameya.aw...@gmail.com] 
Sent: Friday, July 25, 2014 9:22 AM
To: solr-user@lucene.apache.org
Subject: java.lang.OutOfMemoryError: Requested array size exceeds VM limit

Hi,

I am in the process of indexing a lot of documents, but after around 9 documents 
I am getting the below error:

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

I am passing below parameters with Solr :

java -Xms6144m -Xmx6144m -XX:MaxPermSize=512m -Dcom.sun.management.jmxremote 
-XX:+UseParNewGC -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC 
-XX:+CMSIncrementalMode -XX:+CMSParallelRemarkEnabled 
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=70 -XX:ConcGCThreads=6
-XX:ParallelGCThreads=6 -jar start.jar


Also, I am auto-committing after 20,000 documents.


I searched on google for this but could not get any specific answer.


Can anybody help with this?


Thanks,
Ameya


RE: To warm the whole cache of Solr other than the only autowarmcount

2014-07-24 Thread Matt Kuiper (Springblox)
I don't believe this would work.  My understanding (please correct if I have 
this wrong) is that the underlying Lucene document ids have a potential to 
change and so when a newSearcher is created the caches must be regenerated and 
not copied.

Matt

-Original Message-
From: YouPeng Yang [mailto:yypvsxf19870...@gmail.com] 
Sent: Thursday, July 24, 2014 10:26 AM
To: solr-user@lucene.apache.org
Subject: To warm the whole cache of Solr other than the only autowarmcount

Hi

   I think it is wonderful to have caches autowarmed when a commit or soft 
commit happens. However, if I want to warm the whole cache rather than only 
autowarmCount entries, the default auto-warming operation will take a very long 
time. So it occurred to me that maybe it is a good idea to just swap the cache 
references of the new searcher with the caches of the old searcher. That would 
cut down the autowarming time and also the query latency for NRT.
   It is just not a mature idea. I am putting it forward, and hope to get more 
hints or help to make the idea clearer.



regards


RE: Cache response time

2014-06-04 Thread Matt Kuiper
I have not come across one.  Is your question directed to the queryResultCache? 
 

My understanding is that the queryResultCache is the only cache that contains 
full query results that could be used to compare against non-cached results 
times.  I believe the other caches can participate in speeding up a request for 
different parts of the query (i.e. filterCache can help with the filter query 
portions of a request, and documentCache for the stored fields).  I am learning 
myself, so if someone wants to correct, or clarify, please do.

A possible manual approach to answer your question could be to use a JMX 
monitoring tool to retrieve a timestamp for when Solr's JMX 'hits' metric 
increases for the queryResultCache.  Then you could use that timestamp with the 
logs to find the request time of the request associated with the cache-hits 
metric increasing.

Matt 

-Original Message-
From: Branham, Jeremy [HR] [mailto:jeremy.d.bran...@sprint.com] 
Sent: Wednesday, June 04, 2014 1:33 PM
To: solr-user@lucene.apache.org
Subject: Cache response time

Is there a JMX metric for measuring the cache request time?

I can see the avg request times, but I'm assuming this includes the cache and 
non-cache values.

http://wiki.apache.org/solr/SolrPerformanceFactors








SolrCloud zkcli

2014-05-27 Thread Matt Kuiper
Hello,

I am using ZkCLI -cmd upconfig with a reload of each Solr node to update my 
solrconfig within my SolrCloud.  I noticed the linkconfig option at 
https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities, and do 
not quite understand where this option is designed to be used.

Can anyone clarify for me?

Thanks,
Matt


RE: SolrCloud zkcli

2014-05-27 Thread Matt Kuiper
Great, thanks

Matt

-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Tuesday, May 27, 2014 11:13 AM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud zkcli

On 5/27/2014 10:41 AM, Matt Kuiper wrote:
 I am using ZkCLI -cmd upconfig with a reload of each Solr node to update my 
 solrconfig within my SolrCloud.  I noticed the linkconfig option at 
 https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities, and 
 do not quite understand where this option is designed to be used.

If you use linkconfig with a collection that does not exist, you end up with a 
collection *stub* in zookeeper that just contains the configName.  At that 
point you can create the collection manually with CoreAdmin or automatically 
with the Collection Admin, and I would expect that with the latter, you'd be 
able to leave out the collection.configName parameter.

If you use linkconfig with a collection that already exists, you can change 
which configName is linked to that collection.  This is an easy way to swap in 
a dev config on an existing collection.

Thanks,
Shawn



RE: Easises way to insatll solr cloud with tomcat

2014-05-15 Thread Matt Kuiper
Check out http://heliosearch.com/download.html  

This is a distribution of Apache Solr packaged with Tomcat.

I have found it simple to use.

Matt

-Original Message-
From: Aman Tandon [mailto:amantandon...@gmail.com] 
Sent: Monday, May 12, 2014 6:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Easises way to insatll solr cloud with tomcat

Can anybody help me out??

With Regards
Aman Tandon


On Mon, May 12, 2014 at 1:24 PM, Aman Tandon amantandon...@gmail.comwrote:

 Hi,

 I tried to set up SolrCloud with Jetty, which works fine. But in our 
 production environment we use Tomcat, so I need to set up SolrCloud 
 with Tomcat. Please help me out with how to set up SolrCloud with 
 Tomcat on a single machine.

 Thanks in advance.

 With Regards
 Aman Tandon



RE: Easises way to insatll solr cloud with tomcat

2014-05-13 Thread Matt Kuiper (Springblox)
Check out http://heliosearch.com/download.html 

It is a distribution of Apache Solr packaged with Tomcat.

I have found it simple to use.

Matt

-Original Message-
From: Aman Tandon [mailto:amantandon...@gmail.com] 
Sent: Monday, May 12, 2014 6:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Easises way to insatll solr cloud with tomcat

Can anybody help me out??

With Regards
Aman Tandon


On Mon, May 12, 2014 at 1:24 PM, Aman Tandon amantandon...@gmail.comwrote:

 Hi,

 I tried to set up SolrCloud with Jetty, which works fine. But in our 
 production environment we use Tomcat, so I need to set up SolrCloud 
 with Tomcat. Please help me out with how to set up SolrCloud with 
 Tomcat on a single machine.

 Thanks in advance.

 With Regards
 Aman Tandon



RE: cache warming questions

2014-04-17 Thread Matt Kuiper
Ok,  that makes sense.

Thanks again,
Matt

Matt Kuiper - Software Engineer
Intelligent Software Solutions
p. 719.452.7721 | matt.kui...@issinc.com 
www.issinc.com | LinkedIn: intelligent-software-solutions

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Thursday, April 17, 2014 9:26 AM
To: solr-user@lucene.apache.org
Subject: Re: cache warming questions

Don't go overboard warming here, you often hit diminishing returns very 
quickly. For instance, if the size is 512 you might set your autowarm count to 
16 and get the most bang for your buck. Beyond some (usually small) number, the 
additional work you put in to warming is wasted. This is especially true if 
your autocommit (soft, or hard with
openSearcher=true) is short.

So while you're correct in your sizing bit, practically it's rarely that 
complicated since the autowarm count is usually so much smaller than the size 
that there's no danger of swapping them out. YMMV of course.

Best,
Erick

On Wed, Apr 16, 2014 at 10:33 AM, Matt Kuiper matt.kui...@issinc.com wrote:
 Thanks Erick, this is helpful information!

 So it sounds like, at minimum the cache size (at least for filterCache and 
 queryResultCache) should be the sum of the autowarmCount for that cache and 
 the number of queries defined for the newSearcher listener.  Otherwise some 
 items in the caches will be evicted right away.

 Matt

 -Original Message-
 From: Erick Erickson [mailto:erickerick...@gmail.com]
 Sent: Tuesday, April 15, 2014 5:21 PM
 To: solr-user@lucene.apache.org
 Subject: Re: cache warming questions

 bq: What does it mean that items will be regenerated or prepopulated from the 
 current searcher's cache...

 You're right, the values aren't cached. They can't be since the internal 
 Lucene document id is used to identify docs, and due to merging the internal 
 ID may bear no relation to the old internal ID for a particular document.

 I find it useful to think of Solr's caches as a  map where the key is the 
 query and the value is some representation of the found documents. The 
 details of the value don't matter, so I'll skip them.

 What matters is the key. Consider the filter cache. You put something like 
 fq=price:[0 TO 100] on a URL. Solr then uses the fq  clause as the key to 
 the filterCache.

 Here's the sneaky bit. When you specify an autowarm count of N for the 
 filterCache, when a new searcher is opened the first N keys from the map are 
 re-executed in the new searcher's context and the results put into the new 
 searcher's filterCache.

 bq:  ...how does auto warming and explicit warming work together?

 They're orthogonal. IOW, the autowarming for each cache is executed as well 
 as the newSearcher static warming queries. Use the static queries to do 
 things like fill the sort caches etc.

 Incidentally, this bears on why there's a firstSearcher and newSearcher. 
 The newSearcher queries are run in addition to the cache autowarms. 
 firstSearcher static queries are only run when a Solr server is started the 
 first time, and there are no cache entries to autowarm. So the firstSearcher 
 queries might be quite a bit more complex than newSearcher queries.

 HTH,
 Erick

 On Tue, Apr 15, 2014 at 1:55 PM, Matt Kuiper matt.kui...@issinc.com wrote:
 Hello,

 I have a few questions regarding how Solr caches are warmed.

 My understanding is that there are two ways to warm internal Solr caches 
 (only one way for document cache and lucene FieldCache):

 Auto warming - occurs when there is a current searcher handling requests and 
 new searcher is being prepared.  When a new searcher is opened, its caches 
 may be prepopulated or autowarmed with cached object from caches in the 
 old searcher. autowarmCount is the number of cached items that will be 
 regenerated in the new searcher.
 http://wiki.apache.org/solr/SolrCaching#autowarmCount

 Explicit warming - where the static warming queries specified in 
 Solrconfig.xml for newSearcher and firstSearcher listeners are executed when 
 a new searcher is being prepared.

 What does it mean that items will be regenerated or prepopulated from the 
 current searcher's cache to the new searcher's cache?  I doubt it means 
 copy, as the index has likely changed with a commit and possibly invalidated 
 some contents of the cache.  Are the queries, or filters, that define the 
 contents of the current caches re-executed for the new searcher's caches?

 For the case where auto warming is configured, a current searcher is active, 
 and static warming queries are defined how does auto warming and explicit 
 warming work together? Or do they?  Is only one type of warming activated to 
 fill the caches?

 Thanks,
 Matt


RE: cache warming questions

2014-04-16 Thread Matt Kuiper
Thanks Erick, this is helpful information!

So it sounds like, at minimum the cache size (at least for filterCache and 
queryResultCache) should be the sum of the autowarmCount for that cache and the 
number of queries defined for the newSearcher listener.  Otherwise some items 
in the caches will be evicted right away.
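To make that sizing rule concrete, here is a hypothetical solrconfig.xml cache definition (the numbers are illustrative, not recommendations):

```xml
<!-- Hypothetical filterCache: with autowarmCount="128" plus, say, 8 static
     newSearcher warming queries, a size below 128 + 8 = 136 would cause
     freshly warmed entries to be evicted almost immediately. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>
```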

Matt 

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Tuesday, April 15, 2014 5:21 PM
To: solr-user@lucene.apache.org
Subject: Re: cache warming questions

bq: What does it mean that items will be regenerated or prepopulated from the 
current searcher's cache...

You're right, the values aren't copied. They can't be, since the internal Lucene 
document id is used to identify docs, and due to merging the internal ID may 
bear no relation to the old internal ID for a particular document.

I find it useful to think of Solr's caches as a map where the key is the 
query and the value is some representation of the found documents. The 
details of the value don't matter, so I'll skip them.

What matters is the key. Consider the filter cache. You put something like 
fq=price:[0 TO 100] on a URL. Solr then uses the fq clause as the key to the 
filterCache.

Here's the sneaky bit. When you specify an autowarmCount of N for the 
filterCache, and a new searcher is opened, the first N keys from the map are 
re-executed in the new searcher's context and the results put into the new 
searcher's filterCache.
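A toy model of that behavior may help (plain Python, not Solr code; the fq strings and doc ids are made up):

```python
# Toy illustration of cache autowarming: the keys (fq clauses) survive,
# but the values are re-computed against the *new* searcher, never copied.
def autowarm(old_cache, autowarm_count, execute_filter):
    """Re-execute the first N cached filter keys in the new searcher's
    context and return the new searcher's freshly populated cache."""
    new_cache = {}
    for key in list(old_cache)[:autowarm_count]:
        new_cache[key] = execute_filter(key)  # fresh doc ids, new index
    return new_cache

# Old searcher's filterCache: fq clause -> matching internal doc ids.
old = {"price:[0 TO 100]": {1, 5, 9}, "inStock:true": {2, 5}}

# After a commit, the same filter can match different internal doc ids.
new_index = {"price:[0 TO 100]": {1, 6}, "inStock:true": {2, 6, 7}}

warmed = autowarm(old, autowarm_count=1, execute_filter=new_index.get)
print(warmed)  # only the first key is rewarmed, with new doc ids
```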

bq:  ...how does auto warming and explicit warming work together?

They're orthogonal. IOW, the autowarming for each cache is executed as well as 
the newSearcher static warming queries. Use the static queries to do things 
like fill the sort caches etc.

Incidentally, this bears on why there's a firstSearcher and newSearcher. 
The newSearcher queries are run in addition to the cache autowarms. 
firstSearcher static queries are only run when a Solr server is started the 
first time, and there are no cache entries to autowarm. So the firstSearcher 
queries might be quite a bit more complex than newSearcher queries.
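For reference, both kinds of static warming queries are declared in solrconfig.xml along these lines (the query terms here are only placeholders):

```xml
<!-- Runs whenever a new searcher follows an existing one;
     executes in addition to the per-cache autowarming. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">cheap</str><str name="sort">price asc</str></lst>
  </arr>
</listener>

<!-- Runs only for the first searcher after startup, when there is
     no old cache to autowarm from, so it can afford heavier queries. -->
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">solr rocks</str><str name="sort">price asc</str></lst>
  </arr>
</listener>
```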

HTH,
Erick

On Tue, Apr 15, 2014 at 1:55 PM, Matt Kuiper matt.kui...@issinc.com wrote:
 Hello,

 I have a few questions regarding how Solr caches are warmed.

 My understanding is that there are two ways to warm internal Solr caches 
 (only one way for document cache and lucene FieldCache):

 Auto warming - occurs when there is a current searcher handling requests and 
 a new searcher is being prepared.  When a new searcher is opened, its caches 
 may be prepopulated, or autowarmed, with cached objects from caches in the old 
 searcher. autowarmCount is the number of cached items that will be 
 regenerated in the new searcher.
 http://wiki.apache.org/solr/SolrCaching#autowarmCount

 Explicit warming - where the static warming queries specified in 
 Solrconfig.xml for newSearcher and firstSearcher listeners are executed when 
 a new searcher is being prepared.

 What does it mean that items will be regenerated or prepopulated from the 
 current searcher's cache to the new searcher's cache?  I doubt it means copy, 
 as the index has likely changed with a commit and possibly invalidated some 
 contents of the cache.  Are the queries, or filters, that define the contents 
 of the current caches re-executed for the new searcher's caches?

 For the case where auto warming is configured, a current searcher is active, 
 and static warming queries are defined, how do auto warming and explicit 
 warming work together? Or do they?  Is only one type of warming activated to 
 fill the caches?

 Thanks,
 Matt


cache warming questions

2014-04-15 Thread Matt Kuiper
Hello,

I have a few questions regarding how Solr caches are warmed.

My understanding is that there are two ways to warm internal Solr caches (only 
one way for document cache and lucene FieldCache):

Auto warming - occurs when there is a current searcher handling requests and 
a new searcher is being prepared.  When a new searcher is opened, its caches may 
be prepopulated, or autowarmed, with cached objects from caches in the old 
searcher. autowarmCount is the number of cached items that will be regenerated 
in the new searcher. http://wiki.apache.org/solr/SolrCaching#autowarmCount

Explicit warming - where the static warming queries specified in Solrconfig.xml 
for newSearcher and firstSearcher listeners are executed when a new searcher is 
being prepared.

What does it mean that items will be regenerated or prepopulated from the 
current searcher's cache to the new searcher's cache?  I doubt it means copy, 
as the index has likely changed with a commit and possibly invalidated some 
contents of the cache.  Are the queries, or filters, that define the contents 
of the current caches re-executed for the new searcher's caches?

For the case where auto warming is configured, a current searcher is active, 
and static warming queries are defined, how do auto warming and explicit 
warming work together? Or do they?  Is only one type of warming activated to 
fill the caches?

Thanks,
Matt


RE: Solr query with mandatory values

2012-05-09 Thread Matt Kuiper
Yes.  

See http://wiki.apache.org/solr/SolrQuerySyntax  - The standard Solr Query 
Parser syntax is a superset of the Lucene Query Parser syntax.
Which links to http://lucene.apache.org/core/3_6_0/queryparsersyntax.html 

Note - Based on the info on these pages, I believe the + symbol is to be 
placed just before the mandatory value, not before the field name in the query.
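As a sketch, using the field and terms from the question below, that placement would look like:

```
q=myField:(+my +value)      each term required, within myField
q=myField:"my value"        exact phrase match in myField
```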

Matt Kuiper
Intelligent Software Solutions

-Original Message-
From: G.Long [mailto:jde...@gmail.com] 
Sent: Wednesday, May 09, 2012 10:45 AM
To: solr-user@lucene.apache.org
Subject: Solr query with mandatory values

Hi :)

I remember that in a Lucene query, there is something like mandatory values. I 
just have to add a + symbol in front of the mandatory parameter, like: 
+myField:my value

I was wondering if there was something similar in Solr queries? Or is this 
behaviour activated by default?

Gary