Hi,
We seem to be hitting a bug in the (trigger) code that updates
the ACCT_STAT.sz0 counter, in particular when an existing file is updated
while it is empty or is truncated to zero size.
Try:
> touch /lustre_fs/somefile
> rbh-report --user-info -S --szprof
> echo "test" > /lustre_fs/somefile
> rbh-report --user-info -S --szprof
> truncate -s 0 /lustre_fs/somefile
> rbh-report --user-info -S --szprof
and note that in the last report, the sz0 counter for the active
user/group is decreased instead of increased.
If the counter drops below zero, robinhood starts throwing "Unhandled
error 1264: default conversion to DB_REQUEST_FAILED" errors which
prevent the database from being updated, resulting in mismatches between
rbh and lustre metadata (this was how the problem first became apparent
to us).
Taking a closer look at the UPDATE trigger code in the database, the
following line:
sz0=CAST(sz0 as SIGNED)-CAST(((OLD.size=0)+(NEW.size=0)) as SIGNED),
should probably be something like:
sz0=CAST(sz0 as SIGNED)-CAST((OLD.size=0) as SIGNED)+CAST((NEW.size=0)
as SIGNED),
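To make the difference concrete, here is a rough model of the two expressions in plain Python (not the trigger itself). It replays the touch / echo / truncate sequence from above for a single file. The function names and the starting value sz0=1 are my own assumptions for illustration; I am also assuming that error 1264 is MySQL's out-of-range error and that it fires here because sz0 is an unsigned column.

```python
def sz0_current(sz0, old_size, new_size):
    # Models the trigger line as quoted: both comparisons are subtracted.
    return sz0 - (int(old_size == 0) + int(new_size == 0))


def sz0_proposed(sz0, old_size, new_size):
    # Models the suggested line: drop the old state, add the new state.
    return sz0 - int(old_size == 0) + int(new_size == 0)


def replay(update, sizes, sz0=1):
    # Assumption: sz0 starts at 1 because the freshly touched, empty
    # file has already been counted by the time the updates happen.
    history = [sz0]
    for old, new in zip(sizes, sizes[1:]):
        sz0 = update(sz0, old, new)
        history.append(sz0)
    return history


# File sizes after: touch, echo "test" (5 bytes incl. newline), truncate -s 0
sizes = [0, 5, 0]

print("current :", replay(sz0_current, sizes))
print("proposed:", replay(sz0_proposed, sizes))
```

With the current expression the model ends at -1 after the truncate, while the proposed one returns to 1, which matches what rbh-report shows us.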
Cheers,
Hanno