Re: [analog-help] Saving time on DNS lookups

Jason Linhart Thu, 29 Mar 2001 20:40:40 -0800
On 3/29/01 12:17 AM Sean Straw / PSE ([EMAIL PROTECTED]) wrote:

>I don't think you're on the same wavelength as the concept was broadcast 
>on.  The separate utility would STILL be a separate utility, except for the 
>lookup functionality, which would be a plug-in or library of sorts for 
>Analog.  The wholesale DNS resolution would STILL be run *BEFORE* Analog 
>actually does analysis (not for each individual log item) - though 
>inherently by design, it COULD be performed dynamically, but that's not the 
>idea.  Analog would call into the cache lookup function for resolution, 
>which allows the lookup/resolution function to manipulate cache files, 
>tables, regexps, or real lookups any way it pleases without interfering 
>with Analog.

Right, I misunderstood half of what you were saying, although that 
doesn't change my conclusions much.

>>The DNS cache file is a flat sequential file, but it gets loaded into
>>memory only once, where it becomes an optimized hash table.
>
>Ah, it wasn't described that way - it sounded very much like a sequential 
>search through the cache file for each resolution, which would definitely 
>explain why it would be so tremendously slower.

The original description was wrong. Analog reads the entire cache file 
once sequentially as it is starting up, and stores it internally as an 
optimized hash table.
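
A rough sketch of that load-once behaviour, in Python rather than Analog's 
own C. It assumes each cache line holds three whitespace-separated fields 
(lookup time in minutes since 1970, the IP address, and the hostname, with 
"*" for an address that did not resolve); check that layout against the 
Analog docs before relying on it. The function name and sample data are 
made up for illustration.

```python
import io

def load_dns_cache(lines):
    """Read a dnscache-style file once, sequentially, into a dict (a hash table)."""
    cache = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 3 or not fields[0].isdigit():
            continue  # skip blank or malformed lines
        minutes, ip, name = fields
        # Assumption: "*" marks an address that could not be resolved.
        cache[ip] = (int(minutes), None if name == "*" else name)
    return cache

# Hypothetical sample data standing in for a real cache file.
sample = io.StringIO(
    "16423000 192.0.2.10 www.example.com\n"
    "16423005 192.0.2.99 *\n"
)
cache = load_dns_cache(sample)

# After the single pass, every lookup is a constant-time dict access.
print(cache["192.0.2.10"][1])
```

The file is only ever scanned once; each later resolution is a hash 
lookup, not a search through the file.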

>In any event, having a resolution interface of some sort would permit for 
>external development of improvements to how the DNS data is resolved (such 
>as netblocks), since analog would just pass the IP of the host it wanted a 
>hostname for, and the external function would come back with an answer -- 
>however it arrives at that -- a HUGE table, a series of regexps, or a 
>netblock list, whatever - it returns the result.
>
>>Keeping file formats simple is a big advantage, but Analog doesn't let 
>>that constrain what it does internally.
>
>My suggestion is to allow Analog to *NOT CARE* about the format of the DNS 
>cache file, by externalizing its functionality.

If one wanted, one could write a lookup helper that kept its own 
database, in any possible format, and then wrote just the entries that 
were necessary for the current run to the Analog dnscache file. This 
would have dramatically less CPU overhead than a cross-process message 
for each entry, and doesn't require any changes to Analog. Memory usage 
is almost the same either way, in Analog, if the dnscache file is 
pre-optimized to only contain needed entries. The lookup helper needs to 
read the log file anyway, so it already knows which entries need to go in 
the dnscache file.
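
As a sketch of what such a helper might do (not an existing tool), the 
Python below collects the addresses seen in a log and writes out only the 
matching entries from the helper's own store. The store is just a dict 
here, the log is assumed to start each line with the client IP, and the 
three-field output line (minutes since 1970, IP, hostname or "*") is an 
assumption about the dnscache layout. All names are hypothetical.

```python
import re

# Assumption: each log line begins with a dotted-quad client address.
IP_RE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\s")

def addresses_in_log(log_lines):
    """Collect the distinct client addresses that appear in the log."""
    seen = set()
    for line in log_lines:
        m = IP_RE.match(line)
        if m:
            seen.add(m.group(1))
    return seen

def pruned_cache_lines(needed, store):
    """Yield dnscache lines for only the addresses this run needs.

    `store` stands in for the helper's own long-term database:
    ip -> (minutes_since_1970, hostname or None).
    """
    for ip in sorted(needed):
        if ip in store:
            minutes, name = store[ip]
            yield f"{minutes} {ip} {name or '*'}"

# Hypothetical long-term store with one entry the current log never uses.
store = {
    "192.0.2.10": (16423000, "www.example.com"),
    "192.0.2.99": (16423005, None),
    "198.51.100.7": (16400000, "old.example.net"),
}
log = [
    '192.0.2.10 - - [29/Mar/2001:20:40:40 -0800] "GET / HTTP/1.0" 200 512',
    '192.0.2.99 - - [29/Mar/2001:20:41:02 -0800] "GET /a HTTP/1.0" 404 0',
    '192.0.2.10 - - [29/Mar/2001:20:41:10 -0800] "GET /b HTTP/1.0" 200 99',
]
lines = list(pruned_cache_lines(addresses_in_log(log), store))
print("\n".join(lines))
```

The extra entry stays in the helper's store and never reaches the file 
Analog loads, so Analog's memory use tracks the current run only.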

To sum up:

1) Think of the dnscache file as a method of communication between the 
helper and Analog, not as a database or long term storage system. The 
dnscache file is the most CPU efficient way I know of to communicate the 
needed information between the helper and Analog. The CPU overhead of a 
cross-process message for each name would likely be very high.

2) There can be some memory overhead for communicating through a dnscache 
file, as opposed to calling the helper for each name, but that can be 
minimized. The first step of getting the memory overhead down can be done 
by having the helper write a dnscache file that only contains the names 
it needs to contain. Helpers currently have a habit of storing a lot of 
extra names, that might be useful someday, in the dnscache file, but they 
could just as well keep the extra names somewhere else.

3) There is a potential simplicity advantage to having Analog tell the 
helper which files it is processing each run, through a system such as 
you describe. The alternative is either to have the helper read Analog 
config files, or to have a script that tells both programs which logs to 
read (each in its own format). Both alternatives are simple enough that 
the programming in Analog for the new approach doesn't seem justified to 
me.

Jason

-----------------
[EMAIL PROTECTED]
-----------------
Dr. Seuss books . . . can be read and enjoyed on several levels. For
example, 'One Fish Two Fish, Red Fish Blue Fish' can be deconstructed
as a searing indictment of the narrow-minded binary counting system.
  -- Peter van der Linden, Expert C Programming, Deep C Secrets

