Keith Turner wrote:
> 7e56b58a0c7df128 5fa0:6249 [] 1411499311578
>
>
> 3a10885b-d481-4d00-be00-0477e231e965:0000p000872d60eb:499fa72752d82a7c:5c5f19e8
>
> which both happened a little after 3:00pm eastern (I stopped CI around
> 3:30pm eastern). I don't see anything immediately wrong in the tserver
> logs (nor does it appear that I had restarted either of them around
> the timestamp of the above keys). I see no errors in the DN logs
> either around that time window.
>
> I don't have a clue how to even start looking at this to figure out if
>
If you had turned on archiving of walogs, you could look in the walog and
see if the data matches.
You can also see if this data was written around the time of a kill event.
Every CI entry has a counter and an ingester ID. Using the counter and ingester
ID, you can look in the ingester's log file and find a time range for when
that data was ingested. Using that info you can determine what tablet it
was written to and where that tablet was assigned at the time.
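
As a rough sketch of the first step (pulling the ingester ID and counter
out of an entry), something like the following might help. It assumes the
value is laid out as ingester-id:counter:previous-row:checksum, which is
only inferred from the sample value quoted above, and that the last field
of the key is a timestamp in milliseconds since the epoch. The names
CIValue, parse_ci_value and key_timestamp_utc are made up for illustration
and are not part of the CI tools.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class CIValue(NamedTuple):
    """Fields of a CI value (layout assumed from the sample in this thread)."""
    ingester_id: str  # UUID of the ingester that wrote the entry
    counter: str      # per-ingester counter, kept as the raw string
    prev_row: str     # row this entry points back to
    checksum: str     # trailing checksum field


def parse_ci_value(value: str) -> CIValue:
    """Split a CI value into its four colon-separated fields."""
    parts = value.strip().split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 ':'-separated fields, got {len(parts)}")
    return CIValue(*parts)


def key_timestamp_utc(millis: int) -> datetime:
    """Convert a key timestamp (assumed epoch milliseconds) to UTC."""
    seconds, ms = divmod(millis, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(milliseconds=ms)


if __name__ == "__main__":
    v = parse_ci_value(
        "3a10885b-d481-4d00-be00-0477e231e965:"
        "0000p000872d60eb:499fa72752d82a7c:5c5f19e8"
    )
    print(v.ingester_id, v.counter)
    print(key_timestamp_utc(1411499311578).isoformat())
```

The ingester ID and counter are what you would then search for in the
ingester's log, and the converted timestamp gives the window to look at
in the tserver logs.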
If I can't find any other reason that might have caused the failure,
I'll have to re-run with walog archiving turned on.
I checked the tserver logs and neither was killed around the time the
anomalies occurred.