[
https://issues.apache.org/jira/browse/CASSANDRA-16538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309131#comment-17309131
]
Yolanda Tang commented on CASSANDRA-16538:
------------------------------------------
Hi [~brandon.williams],
Sorry for raising this issue here; please close it.
I have raised it on the GitHub issue page instead.
Thanks for replying anyway.
> Cannot run restore for a list of Cassandra nodes
> ------------------------------------------------
>
> Key: CASSANDRA-16538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16538
> Project: Cassandra
> Issue Type: Bug
> Reporter: Yolanda Tang
> Priority: Normal
>
> Hi,
>
> While switching to Cassandra Medusa for restoring node data, we encountered
> some issues.
> Running the restore remotely through pssh times out. Trying the command
> against one Cassandra node, we get
>
> {code:java}
> pssh -H XXXX medusa -vvv restore-node --in-place --no-verify --backup-name 2021031803 --temp-dir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
> [1] 06:52:08 [FAILURE] sha8392 Timed out, Killed by signal 9
> When further looking into the timeout issue, we get logs as
> [2021-03-25 02:23:50,113] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "GET /XX/XX/10.44.XX.XX/2021031803/meta/schema.cql?Version=2006-03-01 HTTP/1.1" 200 24005
> [2021-03-25 02:23:50,114] DEBUG: [Storage] Getting object sre_dev_cass_sha/10.44.79.15/2021031803/meta/tokenmap.json
> [2021-03-25 02:23:50,151] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "HEAD /XX HTTP/1.1" 200 0
> [2021-03-25 02:23:50,201] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "HEAD /XX/XX/10.44.79.15/2021031803/meta/tokenmap.json HTTP/1.1" 200 0
> [2021-03-25 02:23:50,202] DEBUG: Downloading /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a/medusa-restore-197b6c82-4cd5-4c5b-b3c2-9d98863c1b3f as single part
> [2021-03-25 02:23:50,254] DEBUG: https://s3.cn-north-1.amazonaws.com.cn:443 "GET /XX/XX/10.44.XX.XX/2021031803/meta/tokenmap.json?Version=2006-03-01 HTTP/1.1" 200 1535
> [2021-03-25 02:23:50,255] INFO: Stopping Cassandra
> + /usr/bin/nodetool u cassandra -pw if9te8ohKei9xaep drain
> + /usr/bin/nodetool -u cassandra -pw if9te8ohKei9xaep drain
> error: null
> - StackTrace --
> java.io.EOFException
>     at java.io.DataInputStream.readByte(DataInputStream.java:267)
>     at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:222)
>     at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
>     at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
>     at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
>     at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1020)
>     at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:298)
>     at com.sun.proxy.$Proxy8.drain(Unknown Source)
>     at org.apache.cassandra.tools.NodeProbe.drain(NodeProbe.java:371)
>     at org.apache.cassandra.tools.nodetool.Drain.execute(Drain.java:36)
>     at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:244)
>     at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:158)
> + ls -l /var/run/cassandra/cassandra.pid
> ls: cannot access /var/run/cassandra/cassandra.pid: No such file or directory
> + sleep 10
> + echo -n 'Shutdown Cassandra: '
> Shutdown Cassandra: 
> ++ cat /var/run/cassandra/cassandra.pid
> cat: /var/run/cassandra/cassandra.pid: No such file or directory
> + su cassandra -c 'kill '
> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
> ++ seq 40
> + for t in '`seq 40`'
> + /etc/init.d/cassandra status
> + break
> + sleep 5
> + echo OK
> OK
> {code}
> But the command runs successfully on one node when invoked as
> {code:java}
> export LC_ALL=en_US.UTF-8;
> export LANG=en_US.UTF-8;
> export https_proxy=http://proxy.XX:3128 ;
> export PATH=$PATH:/usr/share/cassandra-medusa/bin;
> sudo su;
> mkdir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a;
> cd /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a;
> medusa-wrapper sudo medusa -vvv restore-node --in-place --no-verify --backup-name 2021031803 --temp-dir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
> {code}
> We are running the command on
> {code:java}
> uname -a
> Linux XXXX 5.3.0-53-generic #47~18.04.1-Ubuntu SMP Thu May 7 13:10:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> {code}
> Could you please have a look at the issue?
> Thanks