Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
----- Original Message -----
> On Tue, 2018-06-26 at 11:59 -0400, Dave Anderson wrote:
> > For maintenance sake, it's probably worth keeping the hash queue option
> > in place, primarily since there are a few dozen other internal facilities
> > besides the user "list" command that use the do_list() function.
>
> Ok.  I was actually thinking about replacing all the callers of
> hq_enter() with a new algorithm.  There are 32 of them, so it would be
> a non-trivial patch, though longer term it may be simpler maintenance.
> [...]

Duplicate list detection is not the only reason for the hq_enter()
facility.  It has to stay in place, because after collecting a list,
several hq_enter() users need to read all of the entries back out for
whatever post-processing they need to do.

So it can't just be a "repeal-and-replace"...  ;-)

Dave

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
----- Original Message -----
> [...]
> Also it's unclear to me that there would be an even distribution into
> 'buckets' - HQ_INDEX() does not do much other than a simple mod, so it's
> possible the slowdown I saw was due to uneven distribution of the hash
> function, and we could improve that there.  Still, it would be good to
> have an option with no memory usage for larger lists.

Could be.  That's what the "crash --hash" command line option may
alleviate.  But remember that we changed the default number of hash
queue heads to 32768.

> FWIW, the memory usage does not seem to be freed, or at least I could
> not find it yet.

Yep, you're right.  I was thinking that the hq_close() done by
restore_sanity() would free the memory even if you end the command with
Ctrl-c, but it does keep it around for the next user.

Dave
Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
On 06/26/2018 11:15 AM, Dave Anderson wrote:
> ----- Original Message -----
>> Could the same algorithm be modified so that it can slow down after a
>> certain number of list members, say maybe saving only every 10th
>> element to the hash (but checking every new one)?
>
> I've seen list corruption where a list containing hundreds of entries
> contains an entry that points back to another entry in the list that
> came hundreds of entries before it.

Right, but if we didn't detect the first duplicate, we should still see
one in the next 10 entries.  Once the list repeats, it should repeat
identically.
Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
On Tue, 2018-06-26 at 11:59 -0400, Dave Anderson wrote:
> ----- Original Message -----
> > Do you see any advantage to keeping the hash table for loop detection
> > or would you accept a patch that removes it completely in favor of
> > another algorithm?
>
> For maintenance sake, it's probably worth keeping the hash queue option
> in place, primarily since there are a few dozen other internal facilities
> besides the user "list" command that use the do_list() function.

Ok.  I was actually thinking about replacing all the callers of
hq_enter() with a new algorithm.  There are 32 of them, so it would be a
non-trivial patch, though longer term it may be simpler maintenance.
It probably matters more now, since vmcores of hundreds of GB with much
larger lists are far more common, though for some of these callers a
replacement may not be useful.

I haven't read far enough to know whether there are instances where
dumping out the hash table would be more valuable, or if it's used
solely for loop detection.

Also it's unclear to me that there would be an even distribution into
'buckets' - HQ_INDEX() does not do much other than a simple mod, so it's
possible the slowdown I saw was due to uneven distribution of the hash
function, and we could improve that there.  Still, it would be good to
have an option with no memory usage for larger lists.

FWIW, the memory usage does not seem to be freed, or at least I could
not find it yet.
Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
----- Original Message -----
> Could the same algorithm be modified so that it can slow down after a
> certain number of list members, say maybe saving only every 10th
> element to the hash (but checking every new one)?

I've seen list corruption where a list containing hundreds of entries
contains an entry that points back to another entry in the list that
came hundreds of entries before it.

Dave
Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
On 06/26/2018 10:40 AM, David Wysochanski wrote:
> Do you see any advantage to keeping the hash table for loop detection
> or would you accept a patch that removes it completely in favor of
> another algorithm?

Could the same algorithm be modified so that it can slow down after a
certain number of list members, say maybe saving only every 10th element
to the hash (but checking every new one)?
Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
----- Original Message -----
> Do you see any advantage to keeping the hash table for loop detection
> or would you accept a patch that removes it completely in favor of
> another algorithm?

For maintenance sake, it's probably worth keeping the hash queue option
in place, primarily since there are a few dozen other internal facilities
besides the user "list" command that use the do_list() function.

Dave
Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
On Tue, 2018-06-26 at 11:27 -0400, Dave Anderson wrote:
> > For a storage-less method of list loop-detection: run two walkers
> > down the list, advancing two versus one elements.  If you ever
> > match the same element location after starting, you have a loop.
>
> > I agree some algorithm [1] without a hash table may be better,
> > especially for larger lists.
>
> I'll await your patch...

Do you see any advantage to keeping the hash table for loop detection,
or would you accept a patch that removes it completely in favor of
another algorithm?
Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
----- Original Message -----
> On Tue, 2018-06-26 at 15:34 +0100, Jeremy Harris wrote:
> > For a storage-less method of list loop-detection: run two walkers
> > down the list, advancing two versus one elements.  If you ever
> > match the same element location after starting, you have a loop.
>
> I agree some algorithm [1] without a hash table may be better,
> especially for larger lists.

I'll await your patch...

> I also found that ctrl-c of the very long running crash list command
> did not release the hash table memory - I had to exit crash for that.

Right, just as in any case where malloc() is used, the freed memory is
not removed from the user address space, but remains available for
subsequent allocations.

Dave

> [1] https://en.wikipedia.org/wiki/Cycle_detection
Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
On Tue, 2018-06-26 at 09:21 -0400, Dave Anderson wrote:
> Yes, by default all list entries encountered are put in the built-in
> hash queue, specifically for the purpose of determining whether there
> are duplicate entries.  So if it's still running, it hasn't found any.
>
> To avoid the use of the hashing feature, try entering "set hash off"
> before kicking off the command.  But of course if it finds any, it
> will loop forever.

Ah, ok - yeah, I forgot about the built-in list loop detection!

Probably if we increase the value of --hash when we start crash, we can
keep a constant rate of additions to the file, and it may finish in a
reasonable amount of time.  Any recommendations on sizing that
parameter?  Then again, I guess if the total list is larger than RAM we
may get into swapping.
Re: [Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
----- Original Message -----
> Hi Dave,
>
> We have a fairly large vmcore (around 250GB) with a very long kmem
> cache list, and we are trying to determine whether it contains a loop.
> [...]
>
> Is this type of memory usage with list enumeration expected or not?

Yes, by default all list entries encountered are put in the built-in
hash queue, specifically for the purpose of determining whether there
are duplicate entries.  So if it's still running, it hasn't found any.

To avoid the use of the hashing feature, try entering "set hash off"
before kicking off the command.  But of course if it finds any, it
will loop forever.

Dave
[Crash-utility] crash-7.3.2 very long list iteration progressively increasing memory usage
Hi Dave,

We have a fairly large vmcore (around 250GB) that has a very long kmem
cache list, and we are trying to determine whether a loop exists in it.
The list has literally billions of entries.  Before you roll your eyes,
hear me out.

Just running the following command

  crash> list -H 0x8ac03c81fc28 > list-yeller.txt

seems to increase crash's memory usage very significantly over time, to
the point that we have the following in top output:

  PID   USER     PR  NI  VIRT   RES  SHR   S  %CPU  %MEM  TIME+    COMMAND
  25522 dwysocha 20  0   11.2g  10g  5228  R  97.8  17.5  1106:34  crash

When I started the command yesterday it was adding around 4 million
entries to the file per minute.  At the time I estimated the command
would finish in around 10 hours, and I could use it to determine whether
there was a loop in the list.  But today it has slowed down to less than
1/10th of that, to around 300k entries per minute.

Is this type of memory usage with list enumeration expected or not?

I have not yet begun to delve into the code, but figured you might have
a gut feel whether this is expected and fixable or not.

Thanks.