Hi, 

We seem to be hitting a bug in the (trigger) code that updates the
ACCT_STAT.sz0 counter, in particular when an existing file is updated
and its size is zero before or after the update.

Try: 

> touch /lustre_fs/somefile 
> rbh-report --user-info -S --szprof 
> echo "test" > /lustre_fs/somefile 
> rbh-report --user-info -S --szprof 
> truncate -s 0 /lustre_fs/somefile 
> rbh-report --user-info -S --szprof 

and note that in the last report, the sz0 counter for the active
user/group is decreased instead of increased. 

If the counter drops below zero, robinhood starts throwing "Unhandled
error 1264: default conversion to DB_REQUEST_FAILED" errors (1264 is
MySQL's "Out of range value" error, which would fit an unsigned counter
going negative). These errors prevent the database from being updated,
resulting in mismatches between robinhood and Lustre metadata (this is
how the problem first became apparent to us).

Taking a closer look at the UPDATE trigger code in the database, the
following line: 

sz0=CAST(sz0 as SIGNED)-CAST(((OLD.size=0)+(NEW.size=0)) as SIGNED), 

should probably be something like: 

sz0=CAST(sz0 as SIGNED)-CAST((OLD.size=0) as SIGNED)+CAST((NEW.size=0) as SIGNED),
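
To make the arithmetic concrete, here is a small Python sketch that replays
the size changes from the steps above through both expressions. It is only a
model of the two formulas, not the actual trigger; the function names and the
starting value sz0=1 (assuming the empty file created by touch has already
been counted) are my own assumptions.

```python
def buggy_update(sz0, old_size, new_size):
    # Models: sz0 - ((OLD.size=0) + (NEW.size=0))
    return sz0 - (int(old_size == 0) + int(new_size == 0))


def fixed_update(sz0, old_size, new_size):
    # Models: sz0 - (OLD.size=0) + (NEW.size=0)
    return sz0 - int(old_size == 0) + int(new_size == 0)


def replay(update, size_changes, sz0=1):
    # sz0=1 is an assumption: the empty file from touch is already counted.
    history = []
    for old_size, new_size in size_changes:
        sz0 = update(sz0, old_size, new_size)
        history.append(sz0)
    return history


# echo "test" writes 5 bytes (0 -> 5); truncate -s 0 empties it again (5 -> 0).
changes = [(0, 5), (5, 0)]
print("buggy:", replay(buggy_update, changes))  # [0, -1]
print("fixed:", replay(fixed_update, changes))  # [0, 1]
```

With the current expression the counter ends at -1 after the truncate, while
the proposed one brings it back to 1, which matches the single empty file.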

Cheers, 
Hanno 