[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-13 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531054#comment-13531054
 ] 

Markus Jelsma commented on SOLR-4144:
-

I think this is resolved now; I don't see old index directories and the heap 
issue is definitely gone! Great work!

 SolrCloud replication high heap consumption
 ---

 Key: SOLR-4144
 URL: https://issues.apache.org/jira/browse/SOLR-4144
 Project: Solr
  Issue Type: Bug
  Components: replication (java), SolrCloud
Affects Versions: 5.0
 Environment: 5.0-SNAPSHOT 1366361:1416494M - markus - 2012-12-03 
 14:09:13
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0

 Attachments: SOLR-4144.patch


 Recent versions of SolrCloud require a very high heap size vs. older 
 versions. Another cluster of 5.0.0.2012.10.09.19.29.59 (~4GB per core) can 
 restore an empty node without taking a lot of heap (xmx=256m). Recent 
 versions and current trunk fail miserably even with a higher heap (750m). 
 Both clusters have 10 nodes, 10 shards and 2 cores per node. One note to add 
 is that the cluster on which this fails has only about 1.5GB per core due to 
 changes in the Lucene codec, such as compression.
 After start up everything goes fine...
 {code}
 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
 - : Begin buffering updates. core=shard_c
 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
 - : Begin buffering updates. core=shard_b
 2012-12-04 15:05:35,013 INFO [solr.update.UpdateLog] - [RecoveryThread] - : 
 Starting to buffer updates. FSUpdateLog{state=ACTIVE, tlog=null}
 2012-12-04 15:05:35,013 INFO [solr.update.UpdateLog] - [RecoveryThread] - : 
 Starting to buffer updates. FSUpdateLog{state=ACTIVE, tlog=null}
 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
 - : Attempting to replicate from http://178.21.118.190:8080/solr/shard_b/. 
 core=shard_b
 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
 - : Attempting to replicate from http://178.21.118.192:8080/solr/shard_c/. 
 core=shard_c
 2012-12-04 15:05:35,014 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - 
 : Creating new http client, 
 config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
 2012-12-04 15:05:35,014 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - 
 : Creating new http client, 
 config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
 2012-12-04 15:05:35,052 INFO [solr.handler.ReplicationHandler] - 
 [RecoveryThread] - : Commits will be reserved for  1
 2012-12-04 15:05:35,052 INFO [solr.handler.ReplicationHandler] - 
 [RecoveryThread] - : Commits will be reserved for  1
 2012-12-04 15:05:35,053 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - 
 : Creating new http client, 
 config:connTimeout=5000&socketTimeout=2&allowCompression=false&maxConnections=1&maxConnectionsPerHost=1
 2012-12-04 15:05:35,060 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
  No value set for 'pollInterval'. Timer Task not started.
 2012-12-04 15:05:35,060 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
  No value set for 'pollInterval'. Timer Task not started.
 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Master's generation: 48
 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Slave's generation: 1
 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Starting replication process
 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Master's generation: 47
 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Slave's generation: 1
 2012-12-04 15:05:35,070 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Starting replication process
 2012-12-04 15:05:35,078 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Number of files in latest index in master: 235
 2012-12-04 15:05:35,079 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Number of files in latest index in master: 287
 2012-12-04 15:05:35,084 WARN [solr.core.CachingDirectoryFactory] - 
 [RecoveryThread] - : No lockType configured for 
 NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solr/cores/shard_c/data/index.20121204150535080
  lockFactory=org.apache.lucene.store.NativeFSLockFactory@57530551; 
 maxCacheMB=48.0 maxMergeSizeMB=4.0) assuming 'simple'
 2012-12-04 15:05:35,085 INFO [solr.core.CachingDirectoryFactory] - 
 [RecoveryThread] - : return new directory for 
 /opt/solr/cores/shard_c/data/index.20121204150535080 forceNew:false
 2012-12-04 15:05:35,085 INFO [solr.core.CachingDirectoryFactory] - 
 [RecoveryThread] - : Releasing directory:/opt/solr/cores/shard_c/data

[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-13 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531064#comment-13531064
 ] 

Mark Miller commented on SOLR-4144:
---

It's great to have someone so actively engaging with 5x and reporting issues :)


[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-13 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531079#comment-13531079
 ] 

Commit Tag Bot commented on SOLR-4144:
--

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1421331

SOLR-4144: CHANGES entry



[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-13 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531129#comment-13531129
 ] 

Commit Tag Bot commented on SOLR-4144:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1421334

SOLR-4144: CHANGES entry


 SolrCloud replication high heap consumption
 ---

 Key: SOLR-4144
 URL: https://issues.apache.org/jira/browse/SOLR-4144
 Project: Solr
  Issue Type: Bug
  Components: replication (java), SolrCloud
Affects Versions: 5.0
 Environment: 5.0-SNAPSHOT 1366361:1416494M - markus - 2012-12-03 
 14:09:13
Reporter: Markus Jelsma
Assignee: Yonik Seeley
Priority: Critical
 Fix For: 4.1, 5.0

 Attachments: SOLR-4144.patch



[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-11 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528917#comment-13528917
 ] 

Markus Jelsma commented on SOLR-4144:
-

OK, the heap issue seems to be resolved now. I could not reproduce the error 
after deploying the patch, but I could reproduce it reliably before. However, 
I'm starting to see stale index directories again, similar to a problem that 
was fixed earlier. Even on a clean cluster with empty data directories we see 
stale index.* directories some time after we start indexing data.


[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-11 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529054#comment-13529054
 ] 

Mark Miller commented on SOLR-4144:
---

Probably due to a file name misspelling Yonik spotted yesterday...


[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-11 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529062#comment-13529062
 ] 

Mark Miller commented on SOLR-4144:
---

I committed a fix as part of SOLR-3911.

 SolrCloud replication high heap consumption
 ---

 Key: SOLR-4144
 URL: https://issues.apache.org/jira/browse/SOLR-4144
 Project: Solr
  Issue Type: Bug
  Components: replication (java), SolrCloud
Affects Versions: 5.0
 Environment: 5.0-SNAPSHOT 1366361:1416494M - markus - 2012-12-03 
 14:09:13
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 5.0

 Attachments: SOLR-4144.patch


 Recent versions of SolrCloud require a very high heap size vs. older 
 versions. Another cluster of 5.0.0.2012.10.09.19.29.59 (~4GB per core) can 
 restore an empty node without taking a lot of heap (xmx=256m). Recent 
 versions and current trunk fail miserably even with a higher heap (750m). 
 Both clusters have 10 nodes, 10 shards and 2 cores per node. One note to add 
 is that the cluster on which this fails has only about 1.5GB per core due to 
 changing in the Lucene codec such as compression.
 After start up everything goes fine...
 {code}
 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
 - : Begin buffering updates. core=shard_c
 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
 - : Begin buffering updates. core=shard_b
 2012-12-04 15:05:35,013 INFO [solr.update.UpdateLog] - [RecoveryThread] - : 
 Starting to buffer updates. FSUpdateLog{state=ACTIVE, tlog=null}
 2012-12-04 15:05:35,013 INFO [solr.update.UpdateLog] - [RecoveryThread] - : 
 Starting to buffer updates. FSUpdateLog{state=ACTIVE, tlog=null}
 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
 - : Attempting to replicate from http://178.21.118.190:8080/solr/shard_b/. 
 core=shard_b
 2012-12-04 15:05:35,013 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
 - : Attempting to replicate from http://178.21.118.192:8080/solr/shard_c/. 
 core=shard_c
 2012-12-04 15:05:35,014 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - 
 : Creating new http client, 
 config:maxConnections=128maxConnectionsPerHost=32followRedirects=false
 2012-12-04 15:05:35,014 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - 
 : Creating new http client, 
 config:maxConnections=128maxConnectionsPerHost=32followRedirects=false
 2012-12-04 15:05:35,052 INFO [solr.handler.ReplicationHandler] - 
 [RecoveryThread] - : Commits will be reserved for  1
 2012-12-04 15:05:35,052 INFO [solr.handler.ReplicationHandler] - 
 [RecoveryThread] - : Commits will be reserved for  1
 2012-12-04 15:05:35,053 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - 
 : Creating new http client, 
 config:connTimeout=5000&socketTimeout=2&allowCompression=false&maxConnections=1&maxConnectionsPerHost=1
 2012-12-04 15:05:35,060 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
  No value set for 'pollInterval'. Timer Task not started.
 2012-12-04 15:05:35,060 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
  No value set for 'pollInterval'. Timer Task not started.
 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Master's generation: 48
 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Slave's generation: 1
 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Starting replication process
 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Master's generation: 47
 2012-12-04 15:05:35,069 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Slave's generation: 1
 2012-12-04 15:05:35,070 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Starting replication process
 2012-12-04 15:05:35,078 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Number of files in latest index in master: 235
 2012-12-04 15:05:35,079 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : 
 Number of files in latest index in master: 287
 2012-12-04 15:05:35,084 WARN [solr.core.CachingDirectoryFactory] - 
 [RecoveryThread] - : No lockType configured for 
 NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solr/cores/shard_c/data/index.20121204150535080
  lockFactory=org.apache.lucene.store.NativeFSLockFactory@57530551; 
 maxCacheMB=48.0 maxMergeSizeMB=4.0) assuming 'simple'
 2012-12-04 15:05:35,085 INFO [solr.core.CachingDirectoryFactory] - 
 [RecoveryThread] - : return new directory for 
 /opt/solr/cores/shard_c/data/index.20121204150535080 forceNew:false
 2012-12-04 15:05:35,085 INFO [solr.core.CachingDirectoryFactory] - 
 [RecoveryThread] - : Releasing directory:/opt/solr/cores/shard_c/data
 2012-12-04 15:05:35,085 WARN 

[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-10 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528182#comment-13528182
 ] 

Yonik Seeley commented on SOLR-4144:


I bet this could be due to NRTCachingDirectory?  It makes the decision to cache 
a file or not up-front and can't change when it's part-way through the file.

If there's no mergeInfo or flushInfo in the context (and the file isn't the 
segments file) then it will choose to cache the file.
We need to pass something (like flushInfo) that will convince it not to cache.  
I'll work up a patch...
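
To make the reasoning above concrete, here is a minimal, self-contained sketch of an up-front caching decision of the kind described. It is not Lucene's actual code: the class name, the method name, and the "starts with segments" check are assumptions made for illustration. The two limits are taken from the log line in the report (maxCacheMB=48.0 maxMergeSizeMB=4.0).

```java
/**
 * Hypothetical, simplified model of an up-front "cache this file?" decision.
 * Names and structure are illustrative assumptions, not Lucene's API.
 */
final class CacheDecisionSketch {
    // Limits as reported in the log: maxMergeSizeMB=4.0, maxCacheMB=48.0
    static final long MAX_MERGE_SIZE_BYTES = 4L * 1024 * 1024;
    static final long MAX_CACHED_BYTES = 48L * 1024 * 1024;

    private CacheDecisionSketch() {}

    /**
     * @param fileName             name of the file about to be written
     * @param estimatedBytes       size hint from merge or flush info, or null
     *                             when the context carries neither
     * @param currentlyCachedBytes bytes already held by the cache
     * @return true if the file would be written to the in-heap cache
     */
    static boolean shouldCache(String fileName, Long estimatedBytes,
                               long currentlyCachedBytes) {
        // Assumption: any name starting with "segments" is the segments file.
        if (fileName.startsWith("segments")) {
            return false;
        }
        // Without a hint the estimate falls back to 0, so the file looks tiny.
        long bytes = (estimatedBytes == null) ? 0L : estimatedBytes;
        return bytes <= MAX_MERGE_SIZE_BYTES
                && bytes + currentlyCachedBytes <= MAX_CACHED_BYTES;
    }
}
```

Under these assumptions, a replicated file that arrives with no size hint is judged as if it were 0 bytes and goes to the cache, however large it turns out to be; because the decision is made once, before any bytes are written, the file keeps growing on the heap. Supplying a size hint (for example 1.5GB) makes the same call return false.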

[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-10 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528396#comment-13528396
 ] 

Markus Jelsma commented on SOLR-4144:
-

Yonik, we have a cluster standing by on which I can reproduce the problem. I'll 
reconfirm this problem and then deploy the WARs with your patch tomorrow and 
report back.

[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-10 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528533#comment-13528533
 ] 

Mark Miller commented on SOLR-4144:
---

+1! Patch looks good.

[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-10 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528632#comment-13528632
 ] 

Commit Tag Bot commented on SOLR-4144:
--

[trunk commit] Yonik Seeley
http://svn.apache.org/viewvc?view=revision&revision=1419980

SOLR-4144: don't cache replicated index files


[jira] [Commented] (SOLR-4144) SolrCloud replication high heap consumption

2012-12-04 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510105#comment-13510105
 ] 

Mark Miller commented on SOLR-4144:
---

Thanks for the report! I will try to investigate this soon. I'm on an 11-inch 
low-powered MacBook Air for the week, so I may not get to it until next week, 
but we will see...
