On Mon, Sep 26, 2005 at 11:38:55PM +0200, Jean-Claude Wippler wrote:
> Jack Diederich wrote:
> 
> >The reason I was trying to upgrade to 2.4.9.4 was that I have a corrupt
> >file generated by 2.4.9.3 and I was hoping 2.4.9.4 could open it.
> >
> >import metakit
> >stor = metakit.storage('6051280980.mk', 2) # RW + locking
> >print "about to getas"
> >vw = stor.getas(
> >    'session[id:I,risitor_id:S,session_id:I,page_view_count:I,'
> >    'loadtime_average:F,loadtime_count:I,pagetime_average:F,'
> >    'pagetime_count:I,entry_page:S,exit_page:S,referer:S,banner_id:I,'
> >    'start_time:S,end_time:S,screen_dims:S,color_depth:S,cookies:S,'
> >    'sales:F,agent:S,flash:S,wmp:S,pdf:S,quicktime:S,realaudio:S,os:S,'
> >    'browser:S,browser_ver:S,host_ip:S,host_name:S,'
> >    'hit[id:I,risitor_id:S,rage_id:S,session_id:I,referer:S,'
> >    'real_referer:S,loadtime:F,pagetime:F,created:S,updated:S,'
> >    'browser_width:I,browser_height:I,exit_page:S]]')
> >print "never reached"
> >
> >CPU jumps up to 100% and will go for hours if I let it.
> >I put a copy of the file here
> >http://demo.performancedrivers.com/6051280980.mk
> >(214k).  I'll leave it up there for a couple days and then remove it.
> >
> >The file was created via python in a mod_python apache environment.
> >I have millions of metakit files and some small percentage of them are
> >corrupt like this (I get a few CPU pegged processes a week).
> 
> There is something very fishy with that file... I find over 80 copies  
> of the "4A 4C" hex bytes which denote the start of a MK datafile  
> section (followed by more bytes which sure look like a header to  
> me).  As if opening a file failed, and MK created a fresh MK tail  
> each time.
> 
> This could happen if there is something wrong with the commit,  
> damaging the tail of the file written, so the file data does not get  
> recognized as valid data in the next open.  My only explanation for  
> this is having the file open for writing more than once at the same  
> time - which is a big no-no in MK.  Are you locking against multiple  
> opens?  Web servers and especially CGI's have a habit of getting  
> fired more than once, as part of net retries for example.
> 
> MK 2.4.9.4 can't open such a corrupted file either, btw.
> 
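
The repeated "4A 4C" markers described above suggest a cheap first-pass
check that never hands the file to MK at all: count how often those two
bytes occur. This is only a rough sketch. The names count_markers and
looks_suspicious and the threshold of 10 are made up for illustration, and
since the same two bytes can also turn up in ordinary row data, a high
count is a hint rather than proof of corruption.

```python
MK_MARKER = b"\x4a\x4c"  # the "4A 4C" section-start bytes mentioned above


def count_markers(path, marker=MK_MARKER):
    """Count occurrences of the marker bytes in the file at path."""
    # Per-visitor files are small (the example is 214k), so reading the
    # whole file into memory is assumed to be acceptable here.
    with open(path, "rb") as f:
        return f.read().count(marker)


def looks_suspicious(path, threshold=10):
    """Flag files with more markers than the (hypothetical) threshold."""
    return count_markers(path) > threshold
```

Files flagged this way could then be examined more carefully before
anything is deleted.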

Nuts, I just checked and it is possible for one uncommon code path
to open the file without holding the database lock for a particular
person (each web visitor gets their own MK file for history).  That
is why I don't see the corruption very often.
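
One way to close that gap would be to route every open of a visitor's file
through a single helper that takes an exclusive lock first. The sketch
below is an assumption about how that might look, not the code actually in
use: visitor_lock is a hypothetical name, the ".lock" sidecar file is an
arbitrary choice, and fcntl.flock is Unix-only and advisory, so it only
helps if every code path goes through it.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def visitor_lock(db_path):
    """Hold an exclusive advisory lock for db_path until the block exits."""
    # Lock a sidecar file rather than the datafile itself, so the lock
    # never touches the bytes MK reads and writes.
    lock_file = open(db_path + ".lock", "w")
    try:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks while another holder exists
        yield
    finally:
        lock_file.close()  # closing the descriptor releases the flock
```

The MK storage would be opened, modified, committed and released entirely
inside the "with visitor_lock(path):" block, so that no second writer can
have the file open at the same time.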

Any suggestions for finding corrupt files?  The only thing I can
think of offhand is trying to open them all and killing the process
if it hangs for too long (and then deleting the offending file).
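
Roughly, the open-and-kill idea could be sketched as below. It is written
for a current Python (subprocess.run with a timeout), not the 2.4-era
interpreter used with MK here, and the names probe and find_hung are
hypothetical. The command that actually opens the file is passed in by the
caller, for example a small script that does the storage() and getas()
calls on one path.

```python
import subprocess


def probe(cmd, timeout_seconds):
    """Run cmd in a child process; return 'ok', 'error', or 'hung'."""
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising the timeout.
        return "hung"
    return "ok" if result.returncode == 0 else "error"


def find_hung(paths, make_cmd, timeout_seconds=30):
    """Return the paths whose probe command did not finish in time."""
    return [p for p in paths if probe(make_cmd(p), timeout_seconds) == "hung"]
```

Moving the flagged files aside instead of deleting them right away would
leave room to inspect them later.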

Thanks for the help,

-Jack
_____________________________________________
Metakit mailing list  -  Metakit@equi4.com
http://www.equi4.com/mailman/listinfo/metakit
