Great, that is what the GC was telling us: topic 'dest:1:accounts' was retaining messages for a durable subscription that had not yet received/acked them, so the related data files could not be GC'ed.
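Since around ActiveMQ 5.6 the broker can also expire offline durable subscribers on its own, which avoids this kind of store growth when a consumer machine is decommissioned. A minimal sketch of the relevant activemq.xml attributes, assuming a 5.6+ broker (the 5.5.0 broker in this thread has to be cleaned up by hand, as described below); the broker name and timeout values are illustrative, not recommendations:

    <!-- offlineDurableSubscriberTimeout: delete a durable subscription once its
         consumer has been offline for this many milliseconds (3 days here).
         offlineDurableSubscriberTaskSchedule: how often the check runs (hourly). -->
    <broker xmlns="http://activemq.apache.org/schema/core"
            brokerName="hub"
            offlineDurableSubscriberTimeout="259200000"
            offlineDurableSubscriberTaskSchedule="3600000">
        <!-- usual persistenceAdapter, transportConnectors, etc. -->
    </broker>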
On 9 May 2011 15:12, James Green <[email protected]> wrote:

> Fixed it.
>
> Went to the subscribers pane of the web console and spotted an old machine that had an accounts durable subscription in "NC" mode - the client was long dead.
>
> Thankfully I was able to delete it, and after a pause quite a few of the data files were GC'd. Going to repeat this for the other 'NC' clients and hope to see further clean-up happen.
>
> James
>
> On 9 May 2011 14:43, James Green <[email protected]> wrote:
>
>> Gary,
>>
>> Let me check I understand you correctly.
>>
>> If I see the GC kick in and list lots of references after lots of channels are considered, and the list then suddenly drops by a large amount, that channel is likely to hold considerable references (even though they are likely dead)?
>>
>> This is what I see:
>>
>> 2011-05-09 14:29:27,367 [eckpoint Worker] TRACE MessageDatabase - gc candidates after dest:1:Requests.DeliveryNotificationsRebuild, [113, 114, 117, 118, 121, 122, 123, 134, 135, 136, 138, 139, 140, 143, 144, 148, 149, 152, 153, 165, 166, 167, 169, 170, 171, 174, 175, 178, 179, 180, 183, 184, 196, 197, 200, 201, 202, 205, 206, 209, 210, 211, 214, 215, 217, 218, 219, 222, 223, 226, 227, 228, 231, 232, 245, 246, 249, 250, 254, 255, 258, 259, 263, 264, 276, 277, 280, 281, 282, 285, 286, 287, 289, 290, 291, 294, 295, 296, 308, 309, 312, 313, 314, 317, 318, 319, 322, 323, 327, 328, 340, 341, 344, 345, 346, 349, 350, 351, 354, 355, 356, 358, 359, 360, 372, 373, 374, 377, 378, 379, 381, 382, 383, 386, 387, 391, 392, 414, 419, 423, 424, 441, 442, 445, 446, 447, 450, 451, 455, 457, 469, 470, 474, 475, 476, 479, 480, 481, 485, 486, 487, 490, 491, 492, 507, 508, 512, 513, 514, 519, 520, 521, 524, 525, 526, 529, 530, 531, 545, 546, 547, 551, 552, 553, 556, 557, 558, 563, 564, 567, 568, 569, 582, 583, 584, 585, 589, 590, 591, 595, 596, 597, 600, 601, 602, 603, 606, 622, 623, 624, 628, 629, 630, 634, 636, 640, 641, 642, 645, 646, 647, 661, 662, 663, 666, 667, 668, 669, 672, 673, 674, 678, 681, 684, 685, 686, 699, 700, 701, 702, 705, 706, 707, 710, 711, 712, 713, 719, 722, 723, 724, 737, 738, 739, 740, 743, 744, 745, 749, 750, 751, 752, 755, 756, 757, 760, 761, 762, 775, 776, 777, 778, 781, 782, 783, 784, 787, 788, 789, 790, 794, 795, 796, 799, 800, 801, 812, 813, 814, 815, 819, 821, 824, 825, 826, 832, 833, 834, 837, 838, 839, 852, 853, 854, 855, 859, 860, 865, 866, 867, 870, 871, 872, 876, 891, 892, 893, 897, 898, 899, 902, 903, 904, 908, 909, 910, 911, 914, 915, 916, 930, 932, 936, 937, 938, 942, 943, 947, 948, 949, 968, 969, 973, 974, 975, 979, 980, 985, 986, 987, 990, 991, 992, 1006, 1007, 1008, 1012, 1013, 1017, 1019, 1022, 1024, 1053, 1054, 1058, 1060, 1082, 1083, 1084, 1085, 1088, 1090, 1091, 1094, 1095, 1096, 1099, 1100, 1101, 1102]
>> 2011-05-09 14:29:27,367 [eckpoint Worker] TRACE MessageDatabase - gc candidates after dest:1:accounts, [553, 556, 1102]
>>
>> accounts is a topic, consumed by each of the "production boxes". I am not aware of any problems in this respect. Each consumer is durable. Each message is processed then ACKed.
>>
>> Can you suggest any reason this situation is occurring?
>>
>> Is there a way to list the contents of these data files in a more meaningful way? To list, for example, references to other data files as you suggest?
>>
>> Thanks,
>>
>> James
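The clean-up James describes doing in the web console can also be scripted over JMX, which helps when there are several of these stale subscriptions to remove. A minimal sketch, assuming the broker exposes the default JMX connector on port 1099 and is named "localhost", with activemq-core on the classpath; the client id and subscription name passed to destroyDurableSubscriber are hypothetical and would normally be read off the inactive subscriber list printed just before:

    import javax.management.MBeanServerConnection;
    import javax.management.MBeanServerInvocationHandler;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;
    import org.apache.activemq.broker.jmx.BrokerViewMBean;

    public class RemoveOfflineDurableSubs {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url =
                new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                // JMX naming used by 5.x brokers of this era; adjust BrokerName to match.
                ObjectName brokerObjectName =
                    new ObjectName("org.apache.activemq:BrokerName=localhost,Type=Broker");
                BrokerViewMBean broker = MBeanServerInvocationHandler.newProxyInstance(
                        conn, brokerObjectName, BrokerViewMBean.class, true);

                // Durable topic subscriptions whose client is not currently connected.
                for (ObjectName sub : broker.getInactiveDurableTopicSubscribers()) {
                    System.out.println("offline durable subscription: " + sub);
                }

                // Remove one by client id and subscription name (hypothetical values);
                // this is the equivalent of the delete action in the web console.
                broker.destroyDurableSubscriber("old-machine-client-id", "accounts-subscription");
            } finally {
                connector.close();
            }
        }
    }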
>> On 6 May 2011 17:57, Gary Tully <[email protected]> wrote:
>>
>>> Reading the trace output is a little unintuitive as it follows the code logic...
>>>
>>> It starts with the entire set of data files and considers them all as gc candidates. Then it asks each destination in turn whether it still has pending references and, if so, removes those data files from the gc candidate set.
>>>
>>> The list should get smaller as destinations grab data files.
>>>
>>> In the case below, it looks like after asking dest:0:Outbound.Account.22312 there are still lots of data files that will be ok to gc.
>>>
>>> The second step is to determine whether the candidate data files contain acks for referenced data files. Deleting them would mean that after a failure/recovery restart the acks would be gone and the messages in the referenced data files would be replayed in error.
>>>
>>> That further reduces the candidate list.
>>>
>>> So you need to look for the channel that pulls the most from the gc candidate list.
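For readers who want the clean-up pass Gary describes in one place, here is a rough sketch in plain Java. It is a paraphrase for illustration, not the actual KahaDB MessageDatabase code; the inputs (which data files each destination still references, and which data files carry acks for messages in other files) are hypothetical stand-ins for what the store derives from its index:

    import java.util.Collections;
    import java.util.HashSet;
    import java.util.Iterator;
    import java.util.Map;
    import java.util.Set;
    import java.util.TreeSet;

    public class GcCandidateSketch {

        // Pass 1: every data file starts out as a gc candidate, and each destination
        // removes the files it still has pending references to; this produces the
        // shrinking "gc candidates after dest:..." lists seen in the TRACE output.
        // Pass 2: a candidate that holds acks for a file that is being kept must
        // itself be kept, otherwise a failure/recovery restart would lose the acks
        // and replay those messages in error.
        static Set<Integer> gcCandidates(Set<Integer> allDataFiles,
                                         Map<String, Set<Integer>> referencedByDestination,
                                         Map<Integer, Set<Integer>> ackTargetsByDataFile) {
            Set<Integer> candidates = new TreeSet<Integer>(allDataFiles);
            for (Map.Entry<String, Set<Integer>> e : referencedByDestination.entrySet()) {
                candidates.removeAll(e.getValue());
                System.out.println("gc candidates after " + e.getKey() + ", " + candidates);
            }

            Set<Integer> kept = new HashSet<Integer>(allDataFiles);
            kept.removeAll(candidates); // files some destination still references
            for (Iterator<Integer> it = candidates.iterator(); it.hasNext();) {
                Set<Integer> ackTargets = ackTargetsByDataFile.get(it.next());
                if (ackTargets == null) {
                    ackTargets = Collections.emptySet();
                }
                for (Integer target : ackTargets) {
                    if (kept.contains(target)) {
                        it.remove(); // keep this file too: it holds acks for a kept file
                        break;
                    }
                }
            }
            return candidates; // whatever is left can be deleted
        }
    }

The sketch is simplified (the second pass does not re-check against files it has just retained), but it matches the pattern in the logs above: dest:1:accounts shrinks the candidate list from hundreds of files down to three, which is why that subscription is the one to look at.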
>>> On 6 May 2011 17:18, James Green <[email protected]> wrote:
>>>
>>>> OK, to take but one channel:
>>>>
>>>> 2011-05-06 17:16:25,154 [eckpoint Worker] TRACE MessageDatabase - gc candidates after dest:0:Outbound.Account.22312, [113, 114, 117, 118, 121, 122, 123, 134, 135, 136, 138, 139, 140, 143, 144, 148, 149, 152, 153, 165, 166, 167, 169, 170, 171, 174, 175, 178, 179, 180, 183, 184, 196, 197, 200, 201, 202, 205, 206, 209, 210, 211, 214, 215, 217, 218, 219, 222, 223, 226, 227, 228, 231, 232, 245, 246, 249, 250, 254, 255, 258, 259, 263, 264, 276, 277, 280, 281, 282, 285, 286, 287, 289, 290, 291, 294, 295, 296, 308, 309, 312, 313, 314, 317, 318, 319, 322, 323, 327, 328, 340, 341, 344, 345, 346, 349, 350, 351, 354, 355, 356, 358, 359, 360, 372, 373, 374, 377, 378, 379, 381, 382, 383, 386, 387, 391, 392, 414, 419, 423, 424, 441, 442, 445, 446, 447, 450, 451, 455, 457, 469, 470, 474, 475, 476, 479, 480, 481, 485, 486, 487, 490, 491, 492, 507, 508, 512, 513, 514, 519, 520, 521, 524, 525, 526, 529, 530, 531, 545, 546, 547, 551, 552, 553, 556, 557, 558, 563, 564, 567, 568, 569, 582, 583, 584, 585, 589, 590, 591, 595, 596, 597, 600, 601, 602, 603, 606, 622, 623, 624, 628, 629, 630, 634, 636, 640, 641, 642, 645, 646, 647, 661, 662, 663, 666, 667, 668, 669, 672, 673, 674, 678, 681, 684, 685, 686, 699, 700, 701, 702, 705, 706, 707, 710, 711, 712, 713, 719, 722, 723, 724, 737, 738, 739, 740, 743, 744, 745, 749, 750, 751, 752, 755, 756, 757, 760, 761, 762, 775, 776, 777, 778, 781, 782, 783, 784, 787, 788, 789, 790, 794, 795, 796, 799, 800, 801, 812, 813, 814, 815, 819, 821, 824, 825, 826, 832, 833, 834, 837, 838, 839, 852, 853, 854, 855, 859, 860, 865, 866, 867, 870, 871, 872, 876, 891, 892, 893, 897, 898, 899, 902, 903, 904, 908, 909, 910, 911, 914, 915, 916, 930, 932, 936, 937, 938, 942, 943, 947, 948, 949, 968, 969, 973, 974, 975, 979, 980, 985, 986, 987, 990, 991, 992, 1006, 1007, 1008, 1012, 1013, 1017, 1019, 1022, 1024, 1053, 1054, 1058, 1060, 1082, 1083, 1084, 1085, 1088, 1090, 1091, 1094, 1095, 1096, 1099, 1100]
>>>>
>>>> That channel would only ever have messages sent/received on remote machines. The messages would never go over the network.
>>>>
>>>> Clearly that's a lot of references which should not be there. Any ideas?
>>>>
>>>> James
>>>>
>>>> On 6 May 2011 15:01, Gary Tully <[email protected]> wrote:
>>>>
>>>>> On the broker, enable TRACE level logging for:
>>>>> org.apache.activemq.store.kahadb.MessageDatabase
>>>>>
>>>>> The cleanup processing will then tell you which destination has a reference to those data files.
>>>>>
>>>>> On 6 May 2011 09:26, James Green <[email protected]> wrote:
>>>>>
>>>>>> Ubuntu Linux running AMQ 5.5.0, previously running 5.4.x releases.
>>>>>>
>>>>>> I have just noticed our "hub" machine has 12% store used. df -h inside the kahadb dir shows 357 .log files consuming 12G of space. They begin Oct 2010 - there are no obvious large gaps over time, but some files are clearly gone.
>>>>>>
>>>>>> Looking at lsof, only three are currently open. The hub receives messages on queues and publishes messages on topics.
>>>>>>
>>>>>> Can anyone advise on investigation work, please?
>>>>>>
>>>>>> James
>>>>>
>>>>> --
>>>>> http://blog.garytully.com
>>>>> http://fusesource.com
>>>
>>> --
>>> http://blog.garytully.com
>>> http://fusesource.com

--
http://blog.garytully.com
http://fusesource.com
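For anyone retracing these steps, the TRACE logging Gary suggests above is enabled with a single line in the broker's conf/log4j.properties. This assumes the log4j 1.x configuration shipped with 5.x brokers; the logger inherits the root logger's appenders:

    # Emit the KahaDB checkpoint/clean-up traces shown earlier in this thread,
    # including the per-destination "gc candidates after ..." lists.
    log4j.logger.org.apache.activemq.store.kahadb.MessageDatabase=TRACE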
