Gary,

Let me check that I understand you correctly.
If I see the GC kick in and list lots of references after many channels have been considered, and then the list suddenly drops by a large amount, does that mean the channel that caused the drop is likely to hold a considerable number of references (even though they are probably dead)? This is what I see:

2011-05-09 14:29:27,367 [eckpoint Worker] TRACE MessageDatabase - gc candidates after dest:1:Requests.DeliveryNotificationsRebuild, [113, 114, 117, 118, 121, 122, 123, 134, 135, 136, 138, 139, 140, 143, 144, 148, 149, 152, 153, 165, 166, 167, 169, 170, 171, 174, 175, 178, 179, 180, 183, 184, 196, 197, 200, 201, 202, 205, 206, 209, 210, 211, 214, 215, 217, 218, 219, 222, 223, 226, 227, 228, 231, 232, 245, 246, 249, 250, 254, 255, 258, 259, 263, 264, 276, 277, 280, 281, 282, 285, 286, 287, 289, 290, 291, 294, 295, 296, 308, 309, 312, 313, 314, 317, 318, 319, 322, 323, 327, 328, 340, 341, 344, 345, 346, 349, 350, 351, 354, 355, 356, 358, 359, 360, 372, 373, 374, 377, 378, 379, 381, 382, 383, 386, 387, 391, 392, 414, 419, 423, 424, 441, 442, 445, 446, 447, 450, 451, 455, 457, 469, 470, 474, 475, 476, 479, 480, 481, 485, 486, 487, 490, 491, 492, 507, 508, 512, 513, 514, 519, 520, 521, 524, 525, 526, 529, 530, 531, 545, 546, 547, 551, 552, 553, 556, 557, 558, 563, 564, 567, 568, 569, 582, 583, 584, 585, 589, 590, 591, 595, 596, 597, 600, 601, 602, 603, 606, 622, 623, 624, 628, 629, 630, 634, 636, 640, 641, 642, 645, 646, 647, 661, 662, 663, 666, 667, 668, 669, 672, 673, 674, 678, 681, 684, 685, 686, 699, 700, 701, 702, 705, 706, 707, 710, 711, 712, 713, 719, 722, 723, 724, 737, 738, 739, 740, 743, 744, 745, 749, 750, 751, 752, 755, 756, 757, 760, 761, 762, 775, 776, 777, 778, 781, 782, 783, 784, 787, 788, 789, 790, 794, 795, 796, 799, 800, 801, 812, 813, 814, 815, 819, 821, 824, 825, 826, 832, 833, 834, 837, 838, 839, 852, 853, 854, 855, 859, 860, 865, 866, 867, 870, 871, 872, 876, 891, 892, 893, 897, 898, 899, 902, 903, 904, 908, 909, 910, 911, 914, 915, 916, 930, 932, 936, 937, 938, 942, 943, 947, 948, 949, 968, 969, 973, 974, 975, 979, 980, 985, 986, 987, 990, 991, 992, 1006, 1007, 1008, 1012, 1013, 1017, 1019, 1022, 1024, 1053, 1054, 1058, 1060, 1082, 1083, 1084, 1085, 1088, 1090, 1091, 1094, 1095, 1096, 1099, 1100, 1101, 1102]
2011-05-09 14:29:27,367 [eckpoint Worker] TRACE MessageDatabase - gc candidates after dest:1:accounts, [553, 556, 1102]

accounts is a topic, consumed by each of the "production boxes". I am not aware of any problems in this respect. Each consumer is durable, and each message is processed and then ACKed.

Can you suggest any reason why this situation is occurring? Is there a way to list the contents of these data files in a more meaningful way - for example, to list references to other data files, as you suggest?

Thanks,

James

On 6 May 2011 17:57, Gary Tully <[email protected]> wrote:
> Reading the trace output is a little unintuitive, as it follows the code logic...
> It starts with the entire set of data files and considers them all as gc candidates.
> Then it asks each destination in turn whether it still has pending references, and if so removes those files from the gc candidate set.
>
> The list should get smaller as destinations grab data files.
>
> In the case below, it looks like after asking dest:0:Outbound.Account.22312 there are still lots of data files that will be ok to gc.
>
> The second step is to determine whether the data files contain acks for referenced data files. Deleting them would mean that after a failure/recovery restart the acks would be gone, and the messages in the referenced data files would be replayed in error.
>
> That further reduces the candidate list.
>
> So you need to look for the channel that pulls the most from the gc candidate list.
>
> On 6 May 2011 17:18, James Green <[email protected]> wrote:
> > OK, to take but one channel:
> >
> > 2011-05-06 17:16:25,154 [eckpoint Worker] TRACE MessageDatabase - gc candidates after dest:0:Outbound.Account.22312, [113, 114, 117, 118, 121, 122, 123, 134, 135, 136, 138, 139, 140, 143, 144, 148, 149, 152, 153, 165, 166, 167, 169, 170, 171, 174, 175, 178, 179, 180, 183, 184, 196, 197, 200, 201, 202, 205, 206, 209, 210, 211, 214, 215, 217, 218, 219, 222, 223, 226, 227, 228, 231, 232, 245, 246, 249, 250, 254, 255, 258, 259, 263, 264, 276, 277, 280, 281, 282, 285, 286, 287, 289, 290, 291, 294, 295, 296, 308, 309, 312, 313, 314, 317, 318, 319, 322, 323, 327, 328, 340, 341, 344, 345, 346, 349, 350, 351, 354, 355, 356, 358, 359, 360, 372, 373, 374, 377, 378, 379, 381, 382, 383, 386, 387, 391, 392, 414, 419, 423, 424, 441, 442, 445, 446, 447, 450, 451, 455, 457, 469, 470, 474, 475, 476, 479, 480, 481, 485, 486, 487, 490, 491, 492, 507, 508, 512, 513, 514, 519, 520, 521, 524, 525, 526, 529, 530, 531, 545, 546, 547, 551, 552, 553, 556, 557, 558, 563, 564, 567, 568, 569, 582, 583, 584, 585, 589, 590, 591, 595, 596, 597, 600, 601, 602, 603, 606, 622, 623, 624, 628, 629, 630, 634, 636, 640, 641, 642, 645, 646, 647, 661, 662, 663, 666, 667, 668, 669, 672, 673, 674, 678, 681, 684, 685, 686, 699, 700, 701, 702, 705, 706, 707, 710, 711, 712, 713, 719, 722, 723, 724, 737, 738, 739, 740, 743, 744, 745, 749, 750, 751, 752, 755, 756, 757, 760, 761, 762, 775, 776, 777, 778, 781, 782, 783, 784, 787, 788, 789, 790, 794, 795, 796, 799, 800, 801, 812, 813, 814, 815, 819, 821, 824, 825, 826, 832, 833, 834, 837, 838, 839, 852, 853, 854, 855, 859, 860, 865, 866, 867, 870, 871, 872, 876, 891, 892, 893, 897, 898, 899, 902, 903, 904, 908, 909, 910, 911, 914, 915, 916, 930, 932, 936, 937, 938, 942, 943, 947, 948, 949, 968, 969, 973, 974, 975, 979, 980, 985, 986, 987, 990, 991, 992, 1006, 1007, 1008, 1012, 1013, 1017, 1019, 1022, 1024, 1053, 1054, 1058, 1060, 1082, 1083, 1084, 1085, 1088, 1090, 1091, 1094, 1095, 1096, 1099, 1100]
> >
> > That channel would only ever have messages sent/received on remote machines. The messages would never go over the network.
> >
> > Clearly that's a lot of references which should not be there. Any ideas?
> >
> > James
> >
> > On 6 May 2011 15:01, Gary Tully <[email protected]> wrote:
> >> On the broker, enable TRACE level logging for:
> >> org.apache.activemq.store.kahadb.MessageDatabase
> >>
> >> and the cleanup processing will tell you which destination has a reference to those data files.
> >>
> >> On 6 May 2011 09:26, James Green <[email protected]> wrote:
> >> > Ubuntu Linux running AMQ 5.5.0, previously running 5.4.x releases.
> >> >
> >> > I have just noticed our "hub" machine has 12% store used. df -h inside the kahadb dir shows 357 .log files consuming 12G of space. They begin Oct 2010 - there are no obvious large gaps over time, but some files are clearly gone.
> >> >
> >> > Looking at lsof, only three are currently open. The hub receives messages on queues and publishes messages on topics.
> >> >
> >> > Can anyone advise on investigation work, please?
> >> >
> >> > James
> >>
> >> --
> >> http://blog.garytully.com
> >> http://fusesource.com
>
> --
> http://blog.garytully.com
> http://fusesource.com
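[Editor's note: for anyone finding this thread later, the TRACE logging Gary refers to is typically enabled in the broker's conf/log4j.properties with a per-logger line like the one below. The logger name is taken from Gary's mail above; appender setup and file paths are distribution-specific and omitted here.]

```properties
# Enable TRACE for the KahaDB cleanup/gc diagnostics only,
# leaving the root logger at its default level.
log4j.logger.org.apache.activemq.store.kahadb.MessageDatabase=TRACE
```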
