Do you have a firewall between the two DCs? If yes, "connection
reset" can be caused by Cassandra trying to use a TCP connection which is
already closed by the firewall. Please make sure that you set a high connection
timeout on the firewall. Also, make sure your servers are not overloaded. Please
look into the general causes of connection reset. Also, as I said earlier, the
Cassandra troubleshooting documentation explains it well. Make sure the firewall
and node TCP settings are in sync, so that the nodes close a TCP connection
before the firewall does.
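
As a rough illustration of keeping the two sides in sync, here is a minimal Python sketch that compares the Linux TCP keepalive settings on a node with the firewall's idle timeout. It assumes a Linux node, that keepalive is enabled on the inter-node sockets, and a firewall timeout of 3600 seconds. That number and the function names are made up for the example, so substitute your own values.

```python
from pathlib import Path

# Standard Linux location of the kernel's TCP keepalive settings.
PROC_IPV4 = Path("/proc/sys/net/ipv4")


def read_keepalive(proc_root=PROC_IPV4):
    """Read the node's TCP keepalive settings (time and interval in seconds)."""
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")
    return {name: int((Path(proc_root) / name).read_text().strip()) for name in names}


def keepalive_beats_firewall(keepalive_time, firewall_idle_timeout):
    """True if the first keepalive probe goes out before the firewall
    would drop the idle connection."""
    return keepalive_time < firewall_idle_timeout


def dead_peer_detection_seconds(keepalive_time, keepalive_intvl, keepalive_probes):
    """Approximate idle time after which the kernel gives up on a dead peer."""
    return keepalive_time + keepalive_intvl * keepalive_probes


if __name__ == "__main__":
    FIREWALL_IDLE_TIMEOUT = 3600  # hypothetical value, check your firewall config
    settings = read_keepalive()
    ok = keepalive_beats_firewall(settings["tcp_keepalive_time"], FIREWALL_IDLE_TIMEOUT)
    print(settings, "OK" if ok else "keepalive is slower than the firewall timeout")
```

If the check fails, either lower tcp_keepalive_time on the nodes or raise the timeout on the firewall, so that an idle connection is probed (or detected as dead) before the firewall silently drops it.
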
With a firewall timeout, we generally see Merkle tree requests/responses failing
between nodes in the two DCs, and then the repair hangs forever. I'm not sure how
Merkle tree creation, which is node-specific, would be affected by a multi-DC
setup. Are repairs with the -local option completing without problems?