Gary,

Let me check that I understand you correctly.
If I see the GC kick in and list lots of references after many channels have been considered, and then the list suddenly drops by a large amount, does that mean the channel that caused the drop is likely to hold a considerable number of references (even though they are probably dead)? This is what I see:

2011-05-09 14:29:27,367 [eckpoint Worker] TRACE MessageDatabase - gc candidates after dest:1:Requests.DeliveryNotificationsRebuild, [113, 114, 117, 118, 121, 122, 123, 134, 135, 136, 138, 139, 140, 143, 144, 148, 149, 152, 153, 165, 166, 167, 169, 170, 171, 174, 175, 178, 179, 180, 183, 184, 196, 197, 200, 201, 202, 205, 206, 209, 210, 211, 214, 215, 217, 218, 219, 222, 223, 226, 227, 228, 231, 232, 245, 246, 249, 250, 254, 255, 258, 259, 263, 264, 276, 277, 280, 281, 282, 285, 286, 287, 289, 290, 291, 294, 295, 296, 308, 309, 312, 313, 314, 317, 318, 319, 322, 323, 327, 328, 340, 341, 344, 345, 346, 349, 350, 351, 354, 355, 356, 358, 359, 360, 372, 373, 374, 377, 378, 379, 381, 382, 383, 386, 387, 391, 392, 414, 419, 423, 424, 441, 442, 445, 446, 447, 450, 451, 455, 457, 469, 470, 474, 475, 476, 479, 480, 481, 485, 486, 487, 490, 491, 492, 507, 508, 512, 513, 514, 519, 520, 521, 524, 525, 526, 529, 530, 531, 545, 546, 547, 551, 552, 553, 556, 557, 558, 563, 564, 567, 568, 569, 582, 583, 584, 585, 589, 590, 591, 595, 596, 597, 600, 601, 602, 603, 606, 622, 623, 624, 628, 629, 630, 634, 636, 640, 641, 642, 645, 646, 647, 661, 662, 663, 666, 667, 668, 669, 672, 673, 674, 678, 681, 684, 685, 686, 699, 700, 701, 702, 705, 706, 707, 710, 711, 712, 713, 719, 722, 723, 724, 737, 738, 739, 740, 743, 744, 745, 749, 750, 751, 752, 755, 756, 757, 760, 761, 762, 775, 776, 777, 778, 781, 782, 783, 784, 787, 788, 789, 790, 794, 795, 796, 799, 800, 801, 812, 813, 814, 815, 819, 821, 824, 825, 826, 832, 833, 834, 837, 838, 839, 852, 853, 854, 855, 859, 860, 865, 866, 867, 870, 871, 872, 876, 891, 892, 893, 897, 898, 899, 902, 903, 904, 908, 909, 910, 911, 914, 915, 916, 930, 932, 936, 937, 938, 942, 943, 947, 948, 949, 968, 969, 973, 974, 975, 979, 980, 985, 986, 987, 990, 991, 992, 1006, 1007, 1008, 1012, 1013, 1017, 1019, 1022, 1024, 1053, 1054, 1058, 1060, 1082, 1083, 1084, 1085, 1088, 1090, 1091, 1094, 1095, 1096, 1099, 1100, 1101, 1102]
2011-05-09 14:29:27,367 [eckpoint Worker] TRACE MessageDatabase - gc candidates after dest:1:accounts, [553, 556, 1102]

accounts is a topic, consumed by each of the "production boxes". I am not aware of any problems in this respect. Each consumer is durable, and each message is processed and then ACKed.

Can you suggest any reason why this situation is occurring? Is there a way to list the contents of these data files in a more meaningful way - for example, to list references to other data files, as you suggest?

Thanks,

James

On 6 May 2011 17:57, Gary Tully <[email protected]> wrote:
> Reading the trace output is a little unintuitive, as it follows the code logic...
> It starts with the entire set of data files and considers them all as gc candidates.
> Then it asks each destination in turn whether it still has pending references, and if so removes those files from the gc candidate set.
>
> The list should get smaller as destinations grab data files.
>
> In the case below, it looks like after asking dest:0:Outbound.Account.22312 there are still lots of data files that will be ok to gc.
>
> The second step is to determine whether the data files contain acks for referenced data files. Deleting them would mean that after a failure/recovery restart the acks would be gone, and the messages in the referenced data files would be replayed in error.
>
> That further reduces the candidate list.
>
> So you need to look for the channel that pulls the most from the gc candidate list.
>
> On 6 May 2011 17:18, James Green <[email protected]> wrote:
> > OK, to take but one channel:
> >
> > 2011-05-06 17:16:25,154 [eckpoint Worker] TRACE MessageDatabase - gc candidates after dest:0:Outbound.Account.22312, [113, 114, 117, 118, 121, 122, 123, 134, 135, 136, 138, 139, 140, 143, 144, 148, 149, 152, 153, 165, 166, 167, 169, 170, 171, 174, 175, 178, 179, 180, 183, 184, 196, 197, 200, 201, 202, 205, 206, 209, 210, 211, 214, 215, 217, 218, 219, 222, 223, 226, 227, 228, 231, 232, 245, 246, 249, 250, 254, 255, 258, 259, 263, 264, 276, 277, 280, 281, 282, 285, 286, 287, 289, 290, 291, 294, 295, 296, 308, 309, 312, 313, 314, 317, 318, 319, 322, 323, 327, 328, 340, 341, 344, 345, 346, 349, 350, 351, 354, 355, 356, 358, 359, 360, 372, 373, 374, 377, 378, 379, 381, 382, 383, 386, 387, 391, 392, 414, 419, 423, 424, 441, 442, 445, 446, 447, 450, 451, 455, 457, 469, 470, 474, 475, 476, 479, 480, 481, 485, 486, 487, 490, 491, 492, 507, 508, 512, 513, 514, 519, 520, 521, 524, 525, 526, 529, 530, 531, 545, 546, 547, 551, 552, 553, 556, 557, 558, 563, 564, 567, 568, 569, 582, 583, 584, 585, 589, 590, 591, 595, 596, 597, 600, 601, 602, 603, 606, 622, 623, 624, 628, 629, 630, 634, 636, 640, 641, 642, 645, 646, 647, 661, 662, 663, 666, 667, 668, 669, 672, 673, 674, 678, 681, 684, 685, 686, 699, 700, 701, 702, 705, 706, 707, 710, 711, 712, 713, 719, 722, 723, 724, 737, 738, 739, 740, 743, 744, 745, 749, 750, 751, 752, 755, 756, 757, 760, 761, 762, 775, 776, 777, 778, 781, 782, 783, 784, 787, 788, 789, 790, 794, 795, 796, 799, 800, 801, 812, 813, 814, 815, 819, 821, 824, 825, 826, 832, 833, 834, 837, 838, 839, 852, 853, 854, 855, 859, 860, 865, 866, 867, 870, 871, 872, 876, 891, 892, 893, 897, 898, 899, 902, 903, 904, 908, 909, 910, 911, 914, 915, 916, 930, 932, 936, 937, 938, 942, 943, 947, 948, 949, 968, 969, 973, 974, 975, 979, 980, 985, 986, 987, 990, 991, 992, 1006, 1007, 1008, 1012, 1013, 1017, 1019, 1022, 1024, 1053, 1054, 1058, 1060, 1082, 1083, 1084, 1085, 1088, 1090, 1091, 1094, 1095, 1096, 1099, 1100]
> >
> > That channel would only ever have messages sent/received on remote machines. The messages would never go over the network.
> >
> > Clearly that's a lot of references which should not be there. Any ideas?
> >
> > James
> >
> > On 6 May 2011 15:01, Gary Tully <[email protected]> wrote:
> >> On the broker, enable TRACE level logging for:
> >> org.apache.activemq.store.kahadb.MessageDatabase
> >>
> >> and the cleanup processing will tell you which destination has a reference to those data files.
> >>
> >> On 6 May 2011 09:26, James Green <[email protected]> wrote:
> >> > Ubuntu Linux running AMQ 5.5.0, previously running 5.4.x releases.
> >> >
> >> > I have just noticed our "hub" machine has 12% store used. df -h inside the kahadb dir shows 357 .log files consuming 12G of space. They begin Oct 2010 - there are no obvious large gaps over time, but some files are clearly gone.
> >> >
> >> > Looking at lsof, only three are currently open. The hub receives messages on queues and publishes messages on topics.
> >> >
> >> > Can anyone advise on investigation work, please?
> >> >
> >> > James
> >>
> >> --
> >> http://blog.garytully.com
> >> http://fusesource.com
>
> --
> http://blog.garytully.com
> http://fusesource.com
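[Editor's note: for anyone finding this thread later, the TRACE logging Gary refers to is typically enabled in the broker's conf/log4j.properties with a per-logger line like the one below. The logger name is taken from Gary's mail above; appender setup and file paths are distribution-specific and omitted here.]

```properties
# Enable TRACE for the KahaDB cleanup/gc diagnostics only,
# leaving the root logger at its default level.
log4j.logger.org.apache.activemq.store.kahadb.MessageDatabase=TRACE
```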
