[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511348#comment-13511348
 ] 

Per Steffensen commented on SOLR-4114:
--------------------------------------

I verified that the removal of my "controlled" instance-dir and data-dir in 
OverseerCollectionProcessor.createCollection is ok. I needed to investigate 
how instance-dir and data-dir work. Now I know, and I can see that the 
"controlled" instance-dir and data-dir were a bad idea. Thanks for being so 
thorough, Mark.

During my investigation of instance-dir and data-dir I came up with an 
additional test for BasicDistributedZkTest.testCollectionsAPI: a check that, 
after you have created a lot of collections, no two (or more) shards end up 
using the same index-dir. That was actually what I was afraid would happen 
when you (Mark) removed the "controlled" instance- and data-dir. This 
additional test part runs very fast (200 ms on my local machine), so including 
it will not noticeably extend the run time of the test. Instead of sending a 
patch I will just explain what to do to get this additional testing into 
BasicDistributedZkTest (this description works on 4.0, but I can't imagine 
that it wouldn't on 5.x or 4.x):
* Add this method somewhere in BasicDistributedZkTest
{code}
  private void checkNoTwoShardsUseTheSameIndexDir() throws Exception {
    Map<String, Set<String>> indexDirToShardNamesMap = new HashMap<String, Set<String>>();

    List<MBeanServer> servers = new LinkedList<MBeanServer>();
    servers.add(ManagementFactory.getPlatformMBeanServer());
    servers.addAll(MBeanServerFactory.findMBeanServer(null));
    for (final MBeanServer server : servers) {
      Set<ObjectName> mbeans = new HashSet<ObjectName>();
      mbeans.addAll(server.queryNames(null, null));
      for (final ObjectName mbean : mbeans) {
        Object value;
        Object indexDir;
        Object name;
        try {
          if (((value = server.getAttribute(mbean, "category")) != null
                  && value.toString().equals(Category.CORE.toString()))
              && ((value = server.getAttribute(mbean, "source")) != null
                  && value.toString().contains(SolrCore.class.getSimpleName()))
              && ((indexDir = server.getAttribute(mbean, "indexDir")) != null)
              && ((name = server.getAttribute(mbean, "name")) != null)) {
            if (!indexDirToShardNamesMap.containsKey(indexDir.toString())) {
              indexDirToShardNamesMap.put(indexDir.toString(), new HashSet<String>());
            }
            indexDirToShardNamesMap.get(indexDir.toString()).add(name.toString());
          }
        } catch (Exception e) {
          // ignore, just continue - probably a "category" or "source" attribute not found
        }
      }
    }

    assertTrue("Something is broken in the assert for no shards using the same indexDir - "
        + "probably something was changed in the attributes published in the MBean of "
        + SolrCore.class.getSimpleName(), indexDirToShardNamesMap.size() > 0);
    for (Entry<String, Set<String>> entry : indexDirToShardNamesMap.entrySet()) {
      if (entry.getValue().size() > 1) {
        fail("We have shards using the same indexDir. E.g. shards "
            + entry.getValue().toString() + " all use indexDir " + entry.getKey());
      }
    }
  }
{code}
* Add a call to this method (checkNoTwoShardsUseTheSameIndexDir();) at the end 
of BasicDistributedZkTest.testCollectionsAPI
* Add the line "lst.add("indexDir", getIndexDir());" to 
SolrCore.getStatistics() so that index-dir will also be part of the information 
exposed in the MBean of SolrCore

Please consider including the additional test. It scans all SolrCores in the 
system to see if any of them share an index-dir. I do the scanning by reading 
MBean info from the SolrCores - the simplest way I could come up with. It 
means that SolrCore will now also expose index-dir through its MBean, but I 
guess no one would object to that.
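
The MBean-based scan can be tried in isolation, outside Solr, with a sketch 
along these lines. Everything named here is a hypothetical stand-in and not 
part of Solr: the DemoCore / DemoCoreMBean pair, the "indexdirsketch" JMX 
domain, and the helper methods. Because DemoCore is a standard MBean, its 
attributes are named "IndexDir" and "Name" (derived from the getters), whereas 
the test above reads "indexDir" and "name" from SolrCore's MBean.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class IndexDirScanSketch {

  // Hypothetical stand-in for the attributes a core would publish over JMX.
  public interface DemoCoreMBean {
    String getName();
    String getIndexDir();
  }

  public static class DemoCore implements DemoCoreMBean {
    private final String name;
    private final String indexDir;

    public DemoCore(String name, String indexDir) {
      this.name = name;
      this.indexDir = indexDir;
    }

    public String getName() { return name; }
    public String getIndexDir() { return indexDir; }
  }

  // Registers a demo core under the assumed "indexdirsketch" domain.
  // The name is assumed to contain no ObjectName special characters.
  public static void register(MBeanServer server, String name, String indexDir)
      throws Exception {
    ObjectName on = new ObjectName("indexdirsketch:type=core,name=" + name);
    if (!server.isRegistered(on)) {
      server.registerMBean(new DemoCore(name, indexDir), on);
    }
  }

  // Returns only those index directories that more than one MBean reports.
  public static Map<String, Set<String>> sharedIndexDirs(MBeanServer server,
      ObjectName pattern) throws Exception {
    Map<String, Set<String>> byDir = new TreeMap<String, Set<String>>();
    for (ObjectName on : server.queryNames(pattern, null)) {
      try {
        Object indexDir = server.getAttribute(on, "IndexDir");
        Object name = server.getAttribute(on, "Name");
        if (indexDir == null || name == null) {
          continue;
        }
        Set<String> names = byDir.get(indexDir.toString());
        if (names == null) {
          names = new TreeSet<String>();
          byDir.put(indexDir.toString(), names);
        }
        names.add(name.toString());
      } catch (Exception e) {
        // MBean without these attributes: skip it, as the test above does.
      }
    }
    byDir.values().removeIf(names -> names.size() < 2);
    return byDir;
  }

  public static void main(String[] args) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    register(server, "core1", "/data/index-a");
    register(server, "core2", "/data/index-a"); // deliberately shares a directory
    register(server, "core3", "/data/index-b");
    System.out.println(
        sharedIndexDirs(server, new ObjectName("indexdirsketch:type=core,*")));
  }
}
```

The sketch narrows the query with an ObjectName pattern, while the test above 
scans every MBean and filters on "category" and "source"; either way the 
duplicate check itself is just a grouping of names by index-dir.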

Regards, Per Steffensen
                
> Collection API: Allow multiple shards from one collection on the same Solr 
> server
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-4114
>                 URL: https://issues.apache.org/jira/browse/SOLR-4114
>             Project: Solr
>          Issue Type: New Feature
>          Components: multicore, SolrCloud
>    Affects Versions: 4.0
>         Environment: Solr 4.0.0 release
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: collection-api, multicore, shard, shard-allocation
>         Attachments: SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, 
> SOLR-4114.patch, SOLR-4114_trunk.patch
>
>
> We should support running multiple shards from one collection on the same 
> Solr server - e.g. run a collection with 8 shards on a 4 Solr server cluster 
> (each Solr server running 2 shards).
> Performance tests on our side have shown that this is a good idea, and it is 
> also a good idea for easy elasticity later on - it is much easier to move an 
> entire existing shard from one Solr server to another one that just joined 
> the cluster than it is to split an existing shard between the Solr server 
> that used to run it and the new Solr server.
> See dev mailing list discussion "Multiple shards for one collection on the 
> same Solr server"
