System cores find issues with failing hardware and kernel bugs, not just experimental hardware. In my environment, any outage or crash has to have a root cause analysis. With high-volume trading systems and financial transaction systems, we simply can't reboot and move on. Every effort must be made to find what caused a failure and repair it if it was hardware, or patch it if it was software. Most issues are application related, and the root cause can be found in application or system logs or cores. But with 10,000 systems we find daily hardware failures that would be impossible to find without system cores. "Rare" is only meaningful if you are talking about a small number of systems. The more systems you have, the less rare anything is. We replace bad RAM, CPUs, and disks daily. All would be very hard to find without good core dumps.
But anyone who runs a system that must stay up, needs maximum uptime, or deals with valuable data should configure it to take system cores on crash, even without a vendor contract. You can pay for core analysis on a single incident with most vendors.

On Mar 3, 2013, at 12:09 AM, Tilghman Lesher wrote:

> On Sat, Mar 2, 2013 at 8:17 PM, James Sizemore <[email protected]> wrote:
>> First a little history, I manage 10,000 Unix systems (as part of the
>> Break/Fix group) for a large international bank. So I am a little sensitive
>> to lack of core dumps or failed core dumps. As not having them makes my job
>> a pain. About a third of these are Linux. The problem with using common
>> disk space for core dumps would be that if root or var were full that common
>> space would not be available for dumping the core. So to guaranty a good
>> cores you either need an unused partition of at least RAM size that does
>> nothing useful 99.99% of the time or a swap partition of at least RAM size
>> that could at least be useful for buying time in a case of a memory leak.
>> You still need to find space to save the core if root or var are full but
>> you still have the core. And either way you have some amount of disk space
>> used up that is equal to the size or ram. Why not have it as swap?
>
> Because we're talking about system cores, not process cores. System
> kernel dumps are quite different and, unless you're using experimental
> hardware, are quite rare.
>
> -Tilghman
>
> --
> --
> You received this message because you are subscribed to the Google Groups
> "NLUG" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/nlug-talk?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "NLUG" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
