What's going on in the logs? CPU? i/o?

On Thu, Mar 31, 2011 at 4:20 AM, Or Yanay <o...@peer39.com> wrote:
> Hi all,
>
> My production cluster reads got stuck.
> The ring gives:
>
> Address     Status  State   Load       Owns    Token
>                                                146231632500721020374621781629360107476
> 10.39.21.7  Up      Normal  118.86 GB  18.15%  6968792681466807915334918525105891681
> 10.39.21.2  Up      Normal  170.37 GB  33.20%  63458945745812644657648926377562798568
> 10.39.21.4  Up      Normal  129.49 GB  2.09%   67020233994527804731783987345291668992
> 10.39.21.3  Up      Normal  118.57 GB  31.26%  120208618942813734646032022699594259441
> 10.39.21.6  Up      Normal  171.03 GB  15.29%  146231632500721020374621781629360107476
>
> The 2% bit struck me as odd, so I ran tpstats on 10.39.21.4 and got:
>
> Pool Name               Active  Pending    Completed
> ReadStage               0       0          143370
> *RequestResponseStage*  *8*     *1231283*  *414467*
> MutationStage           0       0          1772203
> ReadRepair              0       0          7678
> GossipStage             0       0          204797
> AntiEntropyStage        0       0          0
> MigrationStage          0       0          0
> MemtablePostFlusher     0       0          48
> StreamStage             0       0          0
> FlushWriter             0       0          48
> FILEUTILS-DELETE-POOL   0       0          46
> MiscStage               0       0          0
> FlushSorter             0       0          0
> InternalResponseStage   0       0          0
> HintedHandoff           0       0          6
>
> So… something went terribly wrong.
> Can anyone suggest what I should do next to fix this?
>
> Thanks.
> -Orr

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
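
The Owns column in the quoted ring output can be sanity-checked from the tokens alone. The following is a minimal sketch in Python, not something taken from the thread. It assumes the cluster uses RandomPartitioner (token space 0 to 2**127, which the thread does not state) and that each node owns the range from the previous node's token up to its own. The name `ownership_percent` is a hypothetical helper.

```python
# Assumption: RandomPartitioner, whose tokens live in [0, 2**127).
RING_SIZE = 2**127

# Tokens copied from the quoted ring output, keyed by node address.
TOKENS = {
    "10.39.21.7": 6968792681466807915334918525105891681,
    "10.39.21.2": 63458945745812644657648926377562798568,
    "10.39.21.4": 67020233994527804731783987345291668992,
    "10.39.21.3": 120208618942813734646032022699594259441,
    "10.39.21.6": 146231632500721020374621781629360107476,
}


def ownership_percent(tokens):
    """Return {address: percent of the ring owned} for distinct tokens."""
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    result = {}
    for i, (address, token) in enumerate(ordered):
        # For i == 0 the index -1 wraps to the node with the highest token.
        previous = ordered[i - 1][1]
        # Width of the range (previous, token], wrapping around the ring.
        # A zero width only occurs for a single node, which owns everything.
        span = (token - previous) % RING_SIZE or RING_SIZE
        result[address] = 100 * span / RING_SIZE
    return result


if __name__ == "__main__":
    for address, percent in ownership_percent(TOKENS).items():
        print(f"{address:<12}{percent:6.2f}%")
```

Under those assumptions the computed shares match the Owns column, including roughly 2.09% for 10.39.21.4, whose token sits very close to that of 10.39.21.2. That would make the small share a consequence of token placement rather than a symptom on the node itself.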