Hello, all or part of the output of lfs changelog would be helpful too.
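
To decide which part of the changelog to send, it may help to find the last record robinhood logged before it stopped. The following is a minimal Python sketch, not part of robinhood. It assumes the DEBUG log lines have the shape quoted further down in this thread ("ChangeLog | MDT0000: <index> <type> <time> <flags> ..."). The function name and the regular expression are illustrative only.

```python
import re

# Assumed shape of a robinhood DEBUG changelog line (taken from the log
# excerpt quoted in this thread):
#   ... ChangeLog | MDT0000: 3435143617 08RENME 1434376871.897757063 0x1 t=[...]
RECORD_RE = re.compile(
    r"ChangeLog \|\s+(?P<mdt>MDT[0-9a-fA-F]{4}):\s+(?P<index>\d+)\s+"
    r"(?P<type>\d{2}\w+)\s+(?P<time>\d+\.\d+)\s+(?P<flags>0x[0-9a-fA-F]+)"
)


def last_changelog_record(log_text):
    """Return (mdt, index, type) of the last changelog record found, or None."""
    last = None
    for match in RECORD_RE.finditer(log_text):
        last = (match.group("mdt"), int(match.group("index")), match.group("type"))
    return last


# Two records copied from the DEBUG excerpt in this thread.
sample = (
    "2015/06/15 18:46:19 robinhood@cs04r-sc-serv-92[110019/11] ChangeLog | "
    "MDT0000: 3435143616 14SATTR 1434376871.895757017 0x14 "
    "t=[0x20001026e:0x86f9:0x0]\n"
    "2015/06/15 18:46:19 robinhood@cs04r-sc-serv-92[110019/11] ChangeLog | "
    "MDT0000: 3435143617 08RENME 1434376871.897757063 0x1 "
    "t=[0x20000fa28:0xb3ef:0x0] p=[0xecedd2c:0x52385992:0x0] LineScan$py.class\n"
)

print(last_changelog_record(sample))  # ('MDT0000', 3435143617, '08RENME')
```

The index it returns could then serve as a starting point when dumping records with lfs changelog, so that only the records around the crash need to be posted. Please check the lfs help on your client for the exact arguments.
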
Do you have in your logs a line like: "warning: llapi_changelog_start() called w/o CHANGELOG_FLAG_JOBID"?

-- 
Henri Doreau

On 16/06/2015 16:03, LEIBOVICI Thomas wrote:
> I understood your servers run Lustre 2.7.
>
> Does robinhood run on a Lustre 2.7 client or an older version?
> Did you build robinhood on a host where Lustre 2.7 is installed or an
> older version?
> Could it be possible that robinhood is currently processing changelog
> records that were generated while you were still running older Lustre (2.5?)
>
>
> Also, it would be helpful to run "gdb" on a binary that includes debug
> information, basically from the source tree:
> ./configure
> gdb --args ./src/robinhood/robinhood --readlog
> > run
>
> Thanks.
>
>
> On 06/15/15 19:50, Frederik Ferner wrote:
>> All,
>>
>> while upgrading our Lustre file system to Lustre 2.7, I also upgraded
>> robinhood to the newly released 2.5.5. I did download the tar file and
>> compiled it locally as the pre-built rpms on sourceforge have a
>> dependency on lustre-modules but on our site the rpm provides
>> lustre-client-modules.
>>
>> The RPM installed fine, the server is running Lustre 2.7 but with the
>> same configuration that previously was running fine (on 2.5.4) the new
>> version now segfaults on startup (called as robinhood --read-log). I'm
>> currently not sure how to debug this further. Any pointers welcome,
>> strace wasn't helpful in determining where it crashes, the log isn't
>> that clear either, with normal options the following are the only lines
>> in the logfile:
>>
>> <snip>
>> 2015/06/15 18:27:36 robinhood@cs04r-sc-serv-92[103117/1] CheckFS |
>> '/mnt/lustre03' matches mount point '/mnt/lustre03', type=lustre,
>> fs=cs04r-sc-mds03-01-10ge@tcp:cs04r-sc-mds03-02-10ge@tcp:/lustre03
>> 2015/06/15 18:27:36 robinhood@cs04r-sc-serv-92[103117/2] SigHdlr |
>> Signals SIGTERM and SIGINT (daemon shutdown) are ready to be used
>> 2015/06/15 18:27:36 robinhood@cs04r-sc-serv-92[103117/2] SigHdlr |
>> Signal SIGHUP (config reloading) is ready to be used
>> 2015/06/15 18:27:36 robinhood@cs04r-sc-serv-92[103117/2] SigHdlr |
>> Signal SIGUSR1 (stats dump) is ready to be used
>> 2015/06/15 18:27:36 robinhood@cs04r-sc-serv-92[103117/1] EntryProc | No
>> class defined in policies, disabling file class matching.
>> 2015/06/15 18:27:36 robinhood@cs04r-sc-serv-92[103117/1] EntryProc | No
>> class defined in policies, disabling dir class matching.
>> 2015/06/15 18:27:36 robinhood@cs04r-sc-serv-92[103117/1] Main | Daemon
>> started (running modules: log_reader)
>> 2015/06/15 18:27:36 robinhood@cs04r-sc-serv-92[103117/3] ChangeLog |
>> LU-1331 is fixed in this version of Lustre.
>> </snip>
>>
>> With --log-level=DEBUG there are quite a few lines like this following
>> before it just stops:
>>
>> <snip>
>> 2015/06/15 18:46:19 robinhood@cs04r-sc-serv-92[110019/11] ChangeLog |
>> MDT0000: 3435143616 14SATTR 1434376871.895757017 0x14
>> t=[0x20001026e:0x86f9:0x0]
>> 2015/06/15 18:46:19 robinhood@cs04r-sc-serv-92[110019/11] ChangeLog |
>> MDT0000: 3435143617 08RENME 1434376871.897757063 0x1
>> t=[0x20000fa28:0xb3ef:0x0] p=[0xecedd2c:0x52385992:0x0]
>> LineScan$py.class s=[0x20001026e:0x86f9:0x0]
>> sp=[0xecedd2c:0x52385992:0x0] .LineScan$py.class.3FMhca
>> </snip>
>>
>>
>> Cheers,
>> Frederik

------------------------------------------------------------------------------
_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support
