[ 
https://issues.apache.org/jira/browse/SOLR-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173504#comment-14173504
 ] 

Shalin Shekhar Mangar edited comment on SOLR-6583 at 10/16/14 7:55 AM:
-----------------------------------------------------------------------

Hi [~jsipprell], what you are seeing is probably a different bug: if you are 
bringing up dead nodes, this error shouldn't happen. 

As I said in the description:
bq. This is because the recoverFromLog uses transaction log references that 
were collected at startup and are no longer valid.

If you start a node then the log references collected at startup should be 
valid and the recoverFromLog method should definitely succeed. Which version of 
Solr are you using? How often do you see this error and is it easily 
reproducible?


was (Author: shalinmangar):
Hi [~jsipprell], what you are seeing is probably a different bug because if you 
are bringing up dead nodes then this error shouldn't happen. 

As I said in the description:
bq. This is because the recoverFromLog uses transaction log references that 
were collected at startup and are no longer valid.

If you starting up nodes then the log references collected at startup should be 
valid and the recoverFromLog method should definitely succeed. Which version of 
Solr are you using? How often do you see this error and is it easily 
reproducible?

> Resuming connection with ZooKeeper causes log replay
> ----------------------------------------------------
>
>                 Key: SOLR-6583
>                 URL: https://issues.apache.org/jira/browse/SOLR-6583
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.10.1
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>            Priority: Minor
>             Fix For: 5.0, Trunk
>
>
> If a node is partitioned from ZooKeeper for an extended period of time then, 
> upon resuming the connection, the node re-registers itself. This causes the 
> recoverFromLog() method to be executed, which fails with the following 
> exception:
> {code}
> 8091124 [Thread-71] ERROR org.apache.solr.update.UpdateLog  – Error inspecting tlog tlog{file=/home/ubuntu/shalin-lusolr/solr/example/solr/collection_5x3_shard5_replica3/data/tlog/tlog.0000000000000009869 refcount=2}
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99)
>         at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:678)
>         at org.apache.solr.update.ChannelFastInputStream.readWrappedStream(TransactionLog.java:784)
>         at org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:89)
>         at org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:125)
>         at java.io.InputStream.read(InputStream.java:101)
>         at org.apache.solr.update.TransactionLog.endsWithCommit(TransactionLog.java:218)
>         at org.apache.solr.update.UpdateLog.recoverFromLog(UpdateLog.java:800)
>         at org.apache.solr.cloud.ZkController.register(ZkController.java:834)
>         at org.apache.solr.cloud.ZkController$1.command(ZkController.java:271)
>         at org.apache.solr.common.cloud.ConnectionManager$1$1.run(ConnectionManager.java:166)
> 8091125 [Thread-71] ERROR org.apache.solr.update.UpdateLog  – Error inspecting tlog tlog{file=/home/ubuntu/shalin-lusolr/solr/example/solr/collection_5x3_shard5_replica3/data/tlog/tlog.0000000000000009870 refcount=2}
> java.nio.channels.ClosedChannelException
>         at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99)
>         at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:678)
>         at org.apache.solr.update.ChannelFastInputStream.readWrappedStream(TransactionLog.java:784)
>         at org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:89)
>         at org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:125)
>         at java.io.InputStream.read(InputStream.java:101)
>         at org.apache.solr.update.TransactionLog.endsWithCommit(TransactionLog.java:218)
>         at org.apache.solr.update.UpdateLog.recoverFromLog(UpdateLog.java:800)
>         at org.apache.solr.cloud.ZkController.register(ZkController.java:834)
>         at org.apache.solr.cloud.ZkController$1.command(ZkController.java:271)
>         at org.apache.solr.common.cloud.ConnectionManager$1$1.run(ConnectionManager.java:166)
> {code}
> This is because the recoverFromLog uses transaction log references that were 
> collected at startup and are no longer valid.
> We shouldn't even be running recoverFromLog code for ZK re-connect.
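
The failure mode in the trace above can be reproduced outside Solr. The 
following is only a minimal sketch, not Solr code: FakeTlog, inspect and 
StaleTlogDemo are hypothetical names invented for illustration, and the 
assumption is simply that a log handle captured "at startup" keeps a 
FileChannel that something else closes before the handle is inspected again.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class StaleTlogDemo {

    /** Hypothetical stand-in for a transaction log handle; not Solr's TransactionLog. */
    static final class FakeTlog {
        final Path path;
        final FileChannel channel;

        FakeTlog(Path path) throws IOException {
            this.path = path;
            this.channel = FileChannel.open(path, StandardOpenOption.READ);
        }

        /** Reads the first byte through the channel captured at construction time. */
        int readFirstByte() throws IOException {
            ByteBuffer buf = ByteBuffer.allocate(1);
            channel.read(buf, 0); // throws ClosedChannelException if the channel was closed
            return buf.get(0);
        }
    }

    /** Inspects a log handle and reports what happened instead of propagating the error. */
    static String inspect(FakeTlog log) {
        try {
            log.readFirstByte();
            return "ok";
        } catch (ClosedChannelException e) {
            return "ClosedChannelException";
        } catch (IOException e) {
            return "IOException";
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("tlog", ".bin");
        Files.write(file, new byte[] {42});

        // Reference collected "at startup": valid while the channel is open.
        FakeTlog startupRef = new FakeTlog(file);
        System.out.println(inspect(startupRef)); // prints "ok"

        // The channel is closed later, but the old reference is still held.
        startupRef.channel.close();
        System.out.println(inspect(startupRef)); // prints "ClosedChannelException"

        Files.delete(file);
    }
}
```

The file on disk is still present in the second call; only the cached 
reference is stale. That is consistent with the description's point that the 
replay should not be attempted with those references on a ZK re-connect.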



