Chris Zimmerman wrote:

> Is there some way that I can write a bit of code that will watch a
> directory and as soon as a file is written to that directory, something is
> run against
> that file?  What would be the best way to turn this into a daemon?
> 

you could take a look at the stat function provided by Perl to see if the 
directory's last modified time or inode change time has changed:

#!/usr/bin/perl -w
use strict;

# previous mtime and inode change time of the directory
my($pmm,$pic);

while(1){

        # mtime and ctime are fields 9 and 10 of stat()
        my($mm,$ic) = (stat('/tmp'))[9,10];

        # skip the comparison on the first pass, before we have
        # anything to compare against
        if(defined $pmm and defined $pic and ($pmm != $mm or $pic != $ic)){
                print "some change to /tmp\n";
        }else{
                print ".\n";
        }

        $pmm=$mm;
        $pic=$ic;

        sleep(7);
}

__END__

1. this is not a daemon.
2. this only reports that there was some change (it could be adding a file, 
deleting a file, etc.) in /tmp, but you don't know what really happened there.
3. this only reports changes to /tmp itself and doesn't notice any change below 
the /tmp level. for example:

#--
#-- the script reports the following 3 changes to /tmp
#--
mkdir /tmp/another
touch /tmp/hi
rm -f /tmp/hi

#--
#-- but doesn't notice the following 3 changes
#--
mkdir /tmp/another/yet
touch /tmp/another/yet/file
rm -fr /tmp/another/yet

because there is really no change to /tmp itself, only to its child directory.

a solution to #2 and #3 can be done with a different approach. something like 
the following might work (a rough sketch in Perl follows the numbered steps):

1. recursively cache (in a hash) all sub directories and files under /tmp 
during start up of your daemon.
2. once in a while, do the same recursive scan of the /tmp directory and 
compare the directory contents with the hash you cached a while ago.
3. if there are any differences, you know something has changed, and because 
you have two hashes, you can easily find out what really happened. for 
example, if the first hash has an entry that the second hash doesn't, you 
know something has been deleted from the directory. or if the second hash 
has something that's missing from the first hash, you know there are new 
files.
4. update the cache to be the most recent scan. repeat from step #2.
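
purely as a rough, untested sketch of those 4 steps (it assumes the same 
/tmp target and 7 second pause as the script above, and it only reports 
created and deleted paths, not files whose contents changed):

#!/usr/bin/perl -w
use strict;
use File::Find;

# hypothetical settings -- change these for your setup
my $dir   = '/tmp';
my $pause = 7;

# step 1: recursively cache every file and sub directory under $dir
sub snapshot {
        my %seen;
        find(sub { $seen{$File::Find::name} = 1 }, $dir);
        return \%seen;
}

my $old = snapshot();

# step 2: rescan once in a while and compare against the cache
while (1) {
        sleep($pause);
        my $new = snapshot();

        # step 3: in the old snapshot but not the new one => deleted
        for my $path (keys %$old) {
                print "deleted: $path\n" unless exists $new->{$path};
        }

        # in the new snapshot but not the old one => created
        for my $path (keys %$new) {
                print "created: $path\n" unless exists $old->{$path};
        }

        # step 4: the latest scan becomes the cache for the next round
        $old = $new;
}

__END__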

the File::Find module can help you do the recursive scan portion fairly 
easily (that's what the sketch above uses). you can take the same approach 
but, instead of caching the directory contents, cache each sub directory's 
last modified time instead (a sketch of that variant follows the list of 
drawbacks below). this will reduce the size of your hash a bit. either way, 
even this approach has many drawbacks:

1. if your target directory is huge, the scanning part will take a long time, 
which brings us to the second drawback.
2. race condition. it's totally possible for a file to appear and disappear 
during your directory scan, especially if the scan takes a long time or the 
directory is "busy" (meaning there is so much activity in the directory that 
files appear and disappear really fast). your scan will miss those.
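
and here is a rough, untested sketch of the mtime-caching variant mentioned 
above (again assuming /tmp; it only tells you which sub directory something 
happened in, not what happened, and it won't notice a file whose contents 
changed without its directory's mtime changing):

#!/usr/bin/perl -w
use strict;
use File::Find;

my $dir = '/tmp';

# cache only the last modified time of each directory under $dir
sub scan_mtimes {
        my %mtime;
        find(sub {
                return unless -d $_;
                my $m = (stat($_))[9];
                $mtime{$File::Find::name} = $m if defined $m;
        }, $dir);
        return \%mtime;
}

my $old = scan_mtimes();

while (1) {
        sleep(7);
        my $new = scan_mtimes();

        for my $d (keys %$new) {
                if (!exists $old->{$d}) {
                        print "new directory: $d\n";
                } elsif ($old->{$d} != $new->{$d}) {
                        print "something changed inside: $d\n";
                }
        }
        for my $d (keys %$old) {
                print "directory removed: $d\n" unless exists $new->{$d};
        }

        $old = $new;
}

__END__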

with either sketch, you might need to apply some kind of locking to the 
directory during the scan. finally, take a look on CPAN to see if something 
comes up. off the top of my head, i don't remember any module that does 
exactly what you want. good luck.

david
