[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531054#comment-13531054 ]

Markus Jelsma commented on SOLR-4144:

I think this is resolved now. I don't see old index directories and the heap issue is definitely gone! Great work!

SolrCloud replication high heap consumption
---
Key: SOLR-4144
URL: https://issues.apache.org/jira/browse/SOLR-4144
Project: Solr
Issue Type: Bug
Components: replication (java), SolrCloud
Affects Versions: 5.0
Environment: 5.0-SNAPSHOT 1366361:1416494M - markus - 2012-12-03 14:09:13
Reporter: Markus Jelsma
Priority: Critical
Fix For: 5.0
Attachments: SOLR-4144.patch

Recent versions of SolrCloud require a very high heap size vs. older versions. Another cluster of 5.0.0.2012.10.09.19.29.59 (~4GB per core) can restore an empty node without taking a lot of heap (xmx=256m). Recent versions and current trunk fail miserably even with a higher heap (750m). Both clusters have 10 nodes, 10 shards and 2 cores per node. One note to add is that the cluster on which this fails has only about 1.5GB per core due to changes in the Lucene codec, such as compression.

After start up everything goes fine...

{code}
2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Begin buffering updates. core=shard_c
2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Begin buffering updates. core=shard_b
2012-12-04 15:05:35,013 INFO [solr.update.UpdateLog] - [RecoveryThread] - : Starting to buffer updates. FSUpdateLog{state=ACTIVE, tlog=null}
2012-12-04 15:05:35,013 INFO [solr.update.UpdateLog] - [RecoveryThread] - : Starting to buffer updates. FSUpdateLog{state=ACTIVE, tlog=null}
2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Attempting to replicate from http://178.21.118.190:8080/solr/shard_b/. core=shard_b
2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Attempting to replicate from http://178.21.118.192:8080/solr/shard_c/. core=shard_c
2012-12-04 15:05:35,014 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - : Creating new http client, config:maxConnections=128maxConnectionsPerHost=32followRedirects=false
2012-12-04 15:05:35,014 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - : Creating new http client, config:maxConnections=128maxConnectionsPerHost=32followRedirects=false
2012-12-04 15:05:35,052 INFO [solr.handler.ReplicationHandler] - [RecoveryThread] - : Commits will be reserved for 1
2012-12-04 15:05:35,052 INFO [solr.handler.ReplicationHandler] - [RecoveryThread] - : Commits will be reserved for 1
2012-12-04 15:05:35,053 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - : Creating new http client, config:connTimeout=5000socketTimeout=2allowCompression=falsemaxConnections=1maxConnectionsPerHost=1
2012-12-04 15:05:35,060 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : No value set for 'pollInterval'. Timer Task not started.
2012-12-04 15:05:35,060 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : No value set for 'pollInterval'. Timer Task not started.
2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : Master's generation: 48
2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : Slave's generation: 1
2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : Starting replication process
2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : Master's generation: 47
2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : Slave's generation: 1
2012-12-04 15:05:35,070 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : Starting replication process
2012-12-04 15:05:35,078 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : Number of files in latest index in master: 235
2012-12-04 15:05:35,079 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : Number of files in latest index in master: 287
2012-12-04 15:05:35,084 WARN [solr.core.CachingDirectoryFactory] - [RecoveryThread] - : No lockType configured for NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solr/cores/shard_c/data/index.20121204150535080 lockFactory=org.apache.lucene.store.NativeFSLockFactory@57530551; maxCacheMB=48.0 maxMergeSizeMB=4.0) assuming 'simple'
2012-12-04 15:05:35,085 INFO [solr.core.CachingDirectoryFactory] - [RecoveryThread] - : return new directory for /opt/solr/cores/shard_c/data/index.20121204150535080 forceNew:false
2012-12-04 15:05:35,085 INFO [solr.core.CachingDirectoryFactory] - [RecoveryThread] - : Releasing directory:/opt/solr/cores/shard_c/data
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531064#comment-13531064 ]

Mark Miller commented on SOLR-4144:

It's great to have someone so actively engaging with 5x and reporting issues :)
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531079#comment-13531079 ]

Commit Tag Bot commented on SOLR-4144:

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1421331

SOLR-4144: CHANGES entry
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531129#comment-13531129 ]

Commit Tag Bot commented on SOLR-4144:

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1421334

SOLR-4144: CHANGES entry

SolrCloud replication high heap consumption
---
Key: SOLR-4144
URL: https://issues.apache.org/jira/browse/SOLR-4144
Project: Solr
Issue Type: Bug
Components: replication (java), SolrCloud
Affects Versions: 5.0
Environment: 5.0-SNAPSHOT 1366361:1416494M - markus - 2012-12-03 14:09:13
Reporter: Markus Jelsma
Assignee: Yonik Seeley
Priority: Critical
Fix For: 4.1, 5.0
Attachments: SOLR-4144.patch
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528917#comment-13528917 ]

Markus Jelsma commented on SOLR-4144:

Ok, the heap thing seems to be resolved now. I could not replicate the error after deploying the patch, but I could reproduce it reliably before.

However, I'm starting to see stale index directories again, similar to a problem that was fixed earlier. Even on a clean cluster with empty data directories we see stale index.* directories some time after we start indexing data.
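
One way to spot the stale index.* directories described in this comment is to list the timestamped index directories under a core's data directory. The following is only a minimal sketch: the class name `IndexDirLister` is hypothetical, the default path is the one that appears in the issue's log output, and the name pattern is inferred from `index.20121204150535080` in the same log. It does not decide which directory is the live one; more than one entry is merely a hint that leftovers may exist.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Stream;

/** Hypothetical helper, not part of Solr: lists timestamped index directories. */
final class IndexDirLister {
    // Matches names such as index.20121204150535080 (pattern inferred from the log).
    private static final Pattern TIMESTAMPED = Pattern.compile("index\\.\\d+");

    /** Returns the sorted names of sub-directories of dataDir that look like index.<timestamp>. */
    static List<String> timestampedIndexDirs(Path dataDir) throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<Path> children = Files.list(dataDir)) {
            children.filter(Files::isDirectory)
                    .map(p -> p.getFileName().toString())
                    .filter(n -> TIMESTAMPED.matcher(n).matches())
                    .forEach(names::add);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: the data directory seen in the issue's log.
        Path dataDir = Paths.get(args.length > 0 ? args[0] : "/opt/solr/cores/shard_c/data");
        if (!Files.isDirectory(dataDir)) {
            System.out.println("Not a directory: " + dataDir);
            return;
        }
        for (String name : timestampedIndexDirs(dataDir)) {
            System.out.println(name);
        }
    }
}
```

Because the names embed a timestamp, the sorted output runs from oldest to newest, which makes it easier to see when leftovers started to accumulate.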
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529054#comment-13529054 ]

Mark Miller commented on SOLR-4144:

Probably due to a file name misspelling Yonik spotted yesterday...
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529062#comment-13529062 ]

Mark Miller commented on SOLR-4144:

I committed a fix as part of SOLR-3911
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528182#comment-13528182 ]

Yonik Seeley commented on SOLR-4144:

I bet this could be due to NRTCachingDirectory? It makes the decision to cache a file or not up-front and can't change when it's part-way through the file. If there's no mergeInfo or flushInfo in the context (and the file isn't the segments file) then it will choose to cache the file. We need to pass something (like flushInfo) that will convince it not to cache. I'll work up a patch...
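To make the caching decision in Yonik Seeley's NRTCachingDirectory comment concrete, here is a minimal, self-contained Java model of it. This is a sketch under stated assumptions, not Lucene's actual code: the class `CacheDecisionSketch`, the method `shouldCache`, its parameters, and the name-prefix test for segments files are all hypothetical. The size limits are taken from the `maxCacheMB=48.0 maxMergeSizeMB=4.0` values in the log, and the 500 MB file size is an arbitrary illustration.

```java
/**
 * Simplified model of an up-front "cache this new file in RAM?" decision.
 * NOT Lucene's actual code: class, method and parameter names are hypothetical.
 */
final class CacheDecisionSketch {
    static final long MB = 1024L * 1024L;

    /**
     * @param fileName          name of the file about to be written
     * @param estimatedBytes    size hint from the write context (standing in for
     *                          mergeInfo/flushInfo); null when no hint is supplied
     * @param cachedBytes       bytes already held in the RAM cache
     * @param maxMergeSizeBytes largest single file that may be cached
     * @param maxCachedBytes    total RAM budget for the cache
     */
    static boolean shouldCache(String fileName, Long estimatedBytes, long cachedBytes,
                               long maxMergeSizeBytes, long maxCachedBytes) {
        // With no hint the estimate defaults to 0, so any file looks small enough.
        long bytes = (estimatedBytes == null) ? 0L : estimatedBytes;
        // Assumption: segments files are recognised by their name prefix.
        boolean isSegmentsFile = fileName.startsWith("segments");
        return !isSegmentsFile
                && bytes <= maxMergeSizeBytes
                && bytes + cachedBytes <= maxCachedBytes;
    }

    public static void main(String[] args) {
        long maxMerge = 4 * MB;   // maxMergeSizeMB=4.0 in the log
        long maxCache = 48 * MB;  // maxCacheMB=48.0 in the log
        // Replicated file written without any size hint: the model caches it.
        System.out.println(shouldCache("_0.fdt", null, 0L, maxMerge, maxCache));
        // Same file with a 500 MB size hint: the model writes it straight to disk.
        System.out.println(shouldCache("_0.fdt", 500 * MB, 0L, maxMerge, maxCache));
        // Segments file: never cached in this model.
        System.out.println(shouldCache("segments_1", null, 0L, maxMerge, maxCache));
    }
}
```

In this model the decision is taken once, before any bytes are written, so a large replicated file that arrives without a size hint is treated as a zero-byte file and accepted into the cache. Supplying a size estimate is what flips the decision, which matches the fix proposed in the comment.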
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528396#comment-13528396 ]

Markus Jelsma commented on SOLR-4144:

Yonik, we have a cluster standing by on which i can reproduce the problem. I'll reconfirm this problem and then deploy the wars with your patch tomorrow and report back.
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528533#comment-13528533 ]

Mark Miller commented on SOLR-4144:

+1! Patch looks good.
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528632#comment-13528632 ]

Commit Tag Bot commented on SOLR-4144:

[trunk commit] Yonik Seeley
http://svn.apache.org/viewvc?view=revision&revision=1419980

SOLR-4144: don't cache replicated index files
[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption
[ https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510105#comment-13510105 ]

Mark Miller commented on SOLR-4144:

Thanks for the report! I will try and investigate this soon - I'm on an 11 inch low powered macbook air for the week, so I may not get to it till next week, but we will see...