Re: Debugging idea for a constrained situation

Jonah Benton Thu, 21 Jan 2016 16:17:34 -0800

If you're hunting for major architectural patterns to consider in what
could amount to a rewrite, well, certainly there is no free lunch. But the
pattern you mention, capturing external event history, is in the vein of
Event Sourcing systems:


http://martinfowler.com/eaaDev/EventSourcing.html

Those sometimes go along with a Command pattern

https://en.wikipedia.org/wiki/Command_pattern

for applying changes to state.

The full context of your system isn't clear, but if the event source
abstraction applies, and events can be persisted in some way that you can
have access to, then it could help in recreating problematic conditions.
The one suggestion I'd make, if possible, is to capture the events in a
facility that is outside your program. If you do it inside your program, as
you were describing, then whatever the potential benefit, you're giving
your program yet another thing to do, and the complexity (and bug count) of
that extra work is not zero.
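
To make that concrete, here is a minimal sketch in Python (not your C++,
just the shape of the idea). Everything in it is an assumption made for
illustration: the event kinds, the emit/replay helper names, and the
StringIO object standing in for a pipe or socket to a separate logging
process. The program only hands events to the sink; rebuilding state from
them happens later, somewhere else.

```python
import io
import json
from functools import reduce


def apply_event(state, event):
    """Pure transition: return a new state, never mutate the old one."""
    kind = event["kind"]
    if kind == "config_loaded":          # hypothetical event kind
        return {**state, "config": event["value"]}
    if kind == "packet_received":        # hypothetical event kind
        return {**state, "packets": state.get("packets", 0) + 1}
    return state  # unknown events are ignored in this sketch


def emit(stream, event):
    """Hand one event to the external sink as a single JSON line."""
    stream.write(json.dumps(event, sort_keys=True) + "\n")


def replay(lines):
    """Rebuild state by folding the recorded events over an empty state."""
    events = (json.loads(line) for line in lines if line.strip())
    return reduce(apply_event, events, {})


# Stand-in for a pipe or socket to a process outside the program.
sink = io.StringIO()
emit(sink, {"kind": "config_loaded", "value": {"mode": "fast"}})
emit(sink, {"kind": "packet_received"})
emit(sink, {"kind": "packet_received"})

# Later, on a developer machine: recreate the state from the event history.
recovered = replay(sink.getvalue().splitlines())
print(recovered)
```

Because apply_event is a pure function, replaying the same lines gives the
same state every time, which is what makes the history useful for
recreating a problematic condition.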

Along the same lines are the concepts of decomposition and decomplecting.
Prefer a larger number of simple, loosely coupled things to a smaller
number of complex, tightly coupled things. If possible, rather than adding
things to one program, split the program into multiple smaller programs
that have precisely defined and limited interfaces. Some motivation in this
vein comes from Dan Bernstein's 7 fundamental rules:

https://cr.yp.to/qmail/guarantee.html

and his paper about the security of qmail:

http://cr.yp.to/qmail/qmailsec-20071101.pdf

Hope that's helpful. Good luck.




On Thu, Jan 21, 2016 at 1:45 PM, <fah...@gmail.com> wrote:

> While not strictly Clojure-related, I thought I'd share this idea with you
> here because (1) I came up with it while thinking about design from a
> Clojure / functional point of view and (2) I respect your opinion. It's
> very likely you'll have better ideas...
>
> *I'm in a highly constrained situation*
>
>    - When an Incident occurs (possible bug, bad behavior), I'm told days
>    later and I never have access to the machine where the code ran
>    - I only have the following
>       - a log file with size constraints (no more than n KB)
>       - the version of the code that was running
>    - I don't have the following
>       - core dumps (they'd be bigger than the n KB anyway, plus no one
>       knows a priori when to persist one for Incidents where the code fully
>       believed it was doing fine)
>       - complete info on how to re-create the environment (only partial
>       info)
>          - since the code can be running on any kind of machine with any
>          kind of configuration
>          - and since there are a lot of other applications of various
>          versions running as well
>          - besides, even if I had complete info, actually re-creating
>          such an environment would be very time consuming and error prone
>
> Figuring out what went wrong has been *painful*.
>
>
> But if I had access to all the values that a program *obtained/received*
> from its environment leading up to the Incident then I could just have my
> program use these values while running in a debugger.
>
>
> *The basic idea is*
>
>    1. Log *external* values used by the program over time in production.
>    Don't worry about internal / local values since they are all derived
>    functionally from these external values.
>    2. When an Incident occurs, load this log and a final time-stamp into
>    the program's "state map"
>    3. Any time the program needs a value from the outside, it uses a
>    value from the state map instead
>    4. Set breakpoints and debug away (I'm stuck using C++ (sadness!))
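
The four steps above can be sketched roughly as follows. This is Python
rather than C++, and the Environment class, its get method, and the decide
function are hypothetical names invented for the sketch; the final
time-stamp from step 2 is left out. The one idea it tries to show is that
every outside value goes through a single gateway, which either fetches
and records it (production) or hands back the recorded one (debugging).

```python
import os
import time


class Environment:
    """Single gateway for every value the program takes from outside."""

    def __init__(self, recorded=None):
        # recorded is None in production and a list of (name, value)
        # pairs when replaying an Incident.
        self.replaying = recorded is not None
        self.log = list(recorded) if recorded else []
        self._cursor = 0

    def get(self, name, fetch):
        if self.replaying:
            logged_name, value = self.log[self._cursor]
            if logged_name != name:
                raise RuntimeError(
                    f"replay diverged: log has {logged_name!r}, "
                    f"program asked for {name!r}"
                )
            self._cursor += 1
            return value
        value = fetch()                  # real access to the environment
        self.log.append((name, value))   # step 1: log the external value
        return value


def decide(env):
    """Program logic: derived purely from values obtained through env."""
    hour = env.get("hour", lambda: time.localtime().tm_hour)
    user = env.get("user", lambda: os.environ.get("USER", "unknown"))
    return f"{user}:{'night' if hour < 6 else 'day'}"


# Production run: values come from the machine and are recorded.
live = Environment()
live_result = decide(live)

# Debug run: the same code sees exactly the recorded values.
debug = Environment(recorded=live.log)
replayed_result = decide(debug)

# A hand-written log lets the program "see" a situation of your choosing.
crafted_result = decide(Environment(recorded=[("hour", 3), ("user", "qa")]))
```

The divergence check is optional, but it fails loudly when the code being
debugged no longer asks for values in the order the log recorded them, for
example because it is a different version from the one that ran.
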
>
> I like this because minimal time is spent re-creating the crime scene. I
> just have to tweak the program to start the task / thread in question after
> it's done loading the state. I won't have to ask QA "do you have a test
> environment where this problem is reproducible?" And I won't be making any
> mistakes in reproducing the Incident because all the values used will be
> loaded in an automated fashion.
>
>
> *Considerations*
>
>    - Since values may be large, I may have to tweak the logging to enable
>    re-using a value from earlier if it hasn't changed instead of logging it
>    all over again. (current / expired / re-use)
>       - The program would be checking if the value has changed for those
>       values which have "expired" (that is, values expire if the task they're
>       related to has finished -- when that task starts up again the program
>       would check the map for a value that it needs, find that it has
>       expired, and go fetch the current one from the environment. Then it
>       can decide how to log it in the state log.)
>    - I have to make sure every value I need would be within the most
>    recent n KB of log. I may have a separate thread that logs a snapshot of
>    the entire state every n KB.
>    - I'm forced to change "every" external access into a conditional that
>    checks the state map first.
>    - Sections of code can always opt out of this as long as
>       - I don't think I'll need to debug it esp. if it's been working
>       fine for months
>       - it's basically separate from the rest of the code (i.e. it
>       won't be involved in re-creating any Incidents in other code)
>    - State maps make the code more test-able since I can make the
>    program "see" any kind of arbitrary weirdness.
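
The first two considerations (skip unchanged values, snapshot
periodically) might look something like this. Again this is a Python
sketch under assumptions: StateLog, record, and rebuild are made-up names,
the list of lines stands in for the real size-limited log file, and the
expiry rule tied to task lifetime is not modelled. It also assumes the
snapshot interval is chosen small enough that a snapshot plus the entries
after it fit inside the n KB you get back.

```python
import json

_MISSING = object()  # sentinel: no value has been logged for this name yet


class StateLog:
    """Append-only log that skips unchanged values and snapshots periodically."""

    def __init__(self, snapshot_every_bytes):
        self.snapshot_every_bytes = snapshot_every_bytes
        self.lines = []      # stand-in for the real log file
        self.current = {}    # latest known value for each name
        self._bytes_since_snapshot = 0

    def _write(self, record):
        line = json.dumps(record, sort_keys=True)
        self.lines.append(line)
        self._bytes_since_snapshot += len(line) + 1  # +1 for the newline

    def record(self, name, value):
        """Log a value only if it differs from the last one logged."""
        if self.current.get(name, _MISSING) == value:
            return False  # re-use: nothing is written
        self.current[name] = value
        self._write({"set": name, "value": value})
        if self._bytes_since_snapshot >= self.snapshot_every_bytes:
            self._write({"snapshot": dict(self.current)})
            self._bytes_since_snapshot = 0
        return True


def rebuild(lines):
    """Recover the state map from log lines; a snapshot resets the map."""
    state = {}
    for line in lines:
        record = json.loads(line)
        if "snapshot" in record:
            state = dict(record["snapshot"])
        else:
            state[record["set"]] = record["value"]
    return state


log = StateLog(snapshot_every_bytes=40)
log.record("mode", "fast")
changed = log.record("mode", "fast")  # same value again: nothing is written
log.record("peers", 3)                # crosses the threshold, so a snapshot follows
log.record("mode", "slow")

# Pretend only the tail of the log survived, starting at the last snapshot.
last_snapshot = max(
    i for i, line in enumerate(log.lines) if "snapshot" in json.loads(line)
)
tail_state = rebuild(log.lines[last_snapshot:])
```

Counting bytes inside record avoids the separate snapshot thread, at the
cost of an occasional slow write. Whether that trade is acceptable depends
on how large the full state is.
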
>
> I'm very interested to know what you think of this. It does smell
> heavy-handed to me -- but having something like it would alleviate a ton of
> pain... It could be worth it. Thanks in advance for any feedback.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

