[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-30 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709534#comment-15709534
 ] 

Scott Blum commented on SOLR-7282:
--

Thanks Kevin, we'll use that.

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-30 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709533#comment-15709533
 ] 

Scott Blum commented on SOLR-7282:
--

OH, it was just added in 6.2

https://issues.apache.org/jira/browse/SOLR-9216

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-30 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709514#comment-15709514
 ] 

Shawn Heisey commented on SOLR-7282:


The way that I was always aware of for changing this in cloud mode is zkcli, 
using the linkconfig command.  Followed of course by a collection reload.

I wasn't even really aware of MODIFYCOLLECTION until just now ... the docs do 
say that you can update collection.configName.  SOLR-5132 (which implemented 
MODIFYCOLLECTION) says that the intent was to automatcially a reload in the 
event the configname was modified.  I don't know if the automatic reload was 
implemented.

https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-modifycoll


> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-30 Thread Kevin Risden (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709484#comment-15709484
 ] 

Kevin Risden commented on SOLR-7282:


[~dragonsinth] - Not sure if its in the COLLECTIONS API, but there is a 
linkconfig operation on the zkcli.sh script. Look under "Link a collection to a 
configuration set" on this page: 
https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-30 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709440#comment-15709440
 ] 

Scott Blum commented on SOLR-7282:
--

Erick, apologies, I'm sick so I'm not explaining this well.

I'm not talking about doing anything contrary to what the user asked for.  I am 
only talking about a memory-saving implementation detail.  If the internal 
object representing a configset is in fact immutable and sharable, then there 
is no user-facing difference as to whether two configsets with identical 
content are internally represented by the same immutable sharable object or two 
different but indistinguishable identical objects.

To answer your other questions we're not on an old solr, we're in solrcloud.  
core.properties doesn't apply the config is in ZK.  The problem is we have one 
configset per collection.  4000 collections, 4000 configsets, all content 
identical.  Does MODIFYCOLLECTION actually allow you to change the configset a 
particular collection points to?  I swear last time I looked at that doc, 
collection.configName wasn't in the list of things you could mutate.

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-30 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709391#comment-15709391
 ] 

Erick Erickson commented on SOLR-7282:
--

Scott:

My concern has nothing to do with technical capabilities/correctness and 
everything to do with surprising the users and, as an aside, introducing the 
possibilities for error. Introducing such code to support an old version of 
Solr is even less appealing, this sounds like a local patch if you think it's 
worth it. And I'm a little puzzled. You mention "collections", which have 
always had the notion of configsets. Or are you dealing with stand-alone?

If you're really thinking cores and working with stand-alone, if/when you 
upgrade to a Solr that _does_ respect configsets, you should be able to change 
your core.properties files and add the configSet parameter, see: 
https://cwiki.apache.org/confluence/display/solr/Defining+core.properties and 
point them all at the same configset. Then the normal processing we're talking 
about here should work without deciding to reuse based on hashing rather than 
the user's express intent. Changing 4,000 properties files while inelegant 
seems like a _lot_ less work than coding/debugging/maintaining some kind of 
"you said to use X but we're ignoring that because we know better"

Reusing the same internal object based on identical specifications (i.e. the 
user named a particular configset) seems like a fine idea. Doing something 
other than what the user specified because we think we know better to support 
an edge case that there should be other ways of addressing seems unnecessary.

IMO of course.



> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-30 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709228#comment-15709228
 ] 

Scott Blum commented on SOLR-7282:
--

I mean, content hashing is a pretty ubiquitous technique.  Git is fundamentally 
built on it.

The reason I personally care is that our cluster is so old I'm not even sure if 
configsets were a thing when we started.  So we have 4000 collections with 4000 
identical configurations.

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-29 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707531#comment-15707531
 ] 

Erick Erickson commented on SOLR-7282:
--

The idea of sharing collections that are named differently but happen to be 
exactly the same strikes me as added complexity for very little gain. And 
tracking down errors in that world would be...interesting. 

+1 to reusing the same objects if possible for configsets that have the same 
znode (assuming it handles the mutable issue)
-1 to using a configset other than the one specified by the user when the 
collection was created because we think it's the same based on the hash.



> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-29 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707512#comment-15707512
 ] 

Scott Blum commented on SOLR-7282:
--

Keep a separate map of znode + version -> content hash?

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-29 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707484#comment-15707484
 ] 

Noble Paul commented on SOLR-7282:
--

How do we compute the hash w/o downloading the content? Doesn't it mean that we 
have to fetch the whole file every time? The only savings that we get, in that 
case, is the cost of parsing

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-29 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707482#comment-15707482
 ] 

Noble Paul commented on SOLR-7282:
--

How do we compute the hash w/o downloading the content? Doesn't it mean that we 
have to fetch the whole file every time? The only savings that we get, in that 
case, is the cost of parsing

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-11-29 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707159#comment-15707159
 ] 

Scott Blum commented on SOLR-7282:
--

Could we cache by content hash?  Then even collections that physically don't 
share the same configset but whose contents are identical could share config.

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-07-13 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374572#comment-15374572
 ] 

Noble Paul commented on SOLR-7282:
--

I have opened a ticket more comprehensively address these requirements SOLR-9273

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-07-12 Thread damien kamerman (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374179#comment-15374179
 ] 

damien kamerman commented on SOLR-7282:
---

What about caching the IndexSchema object only when the schemaFactory is the 
(immutable) ClassicIndexSchemaFactory?

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2016-06-22 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15344197#comment-15344197
 ] 

Noble Paul commented on SOLR-7282:
--

Looked at the patch. It's broken. 

The way you cache things assume that schema is immutable. it is not. 
{{IndexSchema}} can change all the time. So, the cache must be aware of the 
znode version from where the schema is loaded. 

There are two ways we can cache.
# Cache the schema file content and avoid reloading from ZK. This is low impact 
and cheaper to implement but less beneficial. The cost of parsing the XML is 
high. If we can pre-parse the XML and store them as Java objects and share this 
java object it is faster.
# Cache the {{IndexSchema}} object. This would mean, we need to carefully look  
at the lifecycle of components in {{IndexSchema}}

 

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7282) Cache config or index schema objects by configset and share them across cores

2015-10-01 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940653#comment-14940653
 ] 

Shawn Heisey commented on SOLR-7282:


Something to think about, and this might affect the existing non-cloud feature 
"shareSchema" too:

If the actual Java object is shared, we need to be careful about behavior on 
reload.  Users expect that updating the config in zookeeper will not cause 
immediate usage of that config, that they will have to do a reload before it 
will take effect.  Say that one machine, updated with a Solr version that has a 
fix for this issue, hosts ten cores that are replicas for various shards from 
Collection A.  Those ten cores are all sharing Java objects for config and 
schema.

When the user updates the config in zookeeper that for collection A, here's how 
I think everything should play out:

 * The objects that those ten cores are sharing will NOT be updated by the 
change to zookeeper -- they will still contain info from the config before 
modification.
 * When one of those cores is reloaded, it should create new objects holding 
the new information in zookeeper, but the other cores for that collection will 
still have the old objects with the old config.
 * When a second and subsequent cores are reloaded, they should share the 
objects created during the first core reload.

The coding challenge will be to figure out when the existing set of config 
objects (probably at the CoreContainer level) need to be thrown away (but still 
held by existing cores) and replaced with a new set of objects.  I'm guessing 
that the config znode in zookeeper contains some info that will only change 
when the config is updated, that info could probably be used.

> Cache config or index schema objects by configset and share them across cores
> -
>
> Key: SOLR-7282
> URL: https://issues.apache.org/jira/browse/SOLR-7282
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
> Fix For: 5.2, Trunk
>
> Attachments: SOLR-7282.patch
>
>
> Sharing schema and config objects has been known to improve startup 
> performance when a large number of cores are on the same box (See 
> http://wiki.apache.org/solr/LotsOfCores).Damien also saw improvements to 
> cluster startup speed upon caching the index schema in SOLR-7191.
> Now that SolrCloud configuration is based on config sets in ZK, we should 
> explore how we can minimize config/schema parsing for each core in a way that 
> is compatible with the recent/planned changes in the config and schema APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org