Hi All,

Three times in the past few weeks (twice on one system, once on another), the master has started getting UnknownHostExceptions, one by one, for each of the tablet servers. The master then wants to stop them, and eventually all the tablet servers quit.
It goes like this for all the tablet servers:

12 08:14:01,0498tserver:620 ERROR error sending update to tserver3:9997: org.apache.thrift.transport.TTransportException: java.net.UnknownHostException
12 09:01:53,0352master:12 ERROR org.apache.thrift.transport.TTransportException: java.net.UnknownHostException
12 16:35:50,0672master:110 ERROR unable to get tablet server status tserver3:9997[250e6cd2c500012] org.apache.thrift.transport.TTransportException: java.net.UnknownHostException

I've redacted the real host names, of course.

This could be a DNS problem, though each system had been running fine for days before it happened, and the same scenario occurred on two systems that use quite different DNS servers.

If anyone has a hint or has seen something like this, I would appreciate any pointers. I have looked at the JIRA issues regarding DNS outages, but nothing seems to fit this pattern.

Thanks

--
Josef Roehrl
Senior Software Developer
PHEMI Systems
180-887 Great Northern Way
Vancouver, BC V5T 4T5
604-336-1119
Website <http://www.phemi.com/>
Twitter <https://twitter.com/PHEMISystems>
Linkedin <http://www.linkedin.com/company/3561810?trk=tyah&trkInfo=tarId%3A1403279580554%2Ctas%3Aphemi%20hea%2Cidx%3A1-1-1>
