Well, I believe I've solved the problem. Hieu, I believe that the fix
you describe is exactly what my friend did to fix his problem.

To fix mine, I started by recompiling, turning off threading, since I
wasn't using it. That got me past the initial problem, but moses still
died sometimes.

Following Ken's advice, I tracked down the error log, and when it died
it complained about not being able to find GLIBCXX_3.4.9 or
GLIBCXX_3.4.10 or GLIBCXX_3.4.11 in /usr/lib64/libstdc++.so.6.

I ssh'd in to the machine where that job had run and died, and ran the
following to check:

strings /usr/lib64/libstdc++.so.6 | grep GLIBCXX

Sure enough, the latest version provided there was GLIBCXX_3.4.8. But,
when I ran the same command on the machine where I'd compiled Moses,
the later versions also were available. That tipped me off that
something was wrong. I checked the OS versions, and it turns out that
there are a handful of machines on our local grid that are still
CentOS 5.5. I had compiled under Scientific Linux 6, and most of the
machines on the grid have been upgraded to SL6.

So it appears that the problem was that I had compiled moses against a
newer version of libstdc++, and when I ran on the older distro that had
an older libstdc++, it would die because it expected the newer GLIBCXX
symbol versions to be there and they weren't.
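
For anyone who wants to repeat this check without eyeballing the output,
here is a rough sketch of how the comparison could be scripted. The
function names glibcxx_versions and missing_glibcxx are made up for this
example, and it uses GNU grep's -a and -o options in place of the
strings | grep pipeline above, so it assumes GNU grep and GNU sort are
installed. It only looks at version tags embedded in the files, so treat
its output as a hint rather than a definitive answer.

```shell
# Hypothetical helper: list the distinct GLIBCXX_x.y.z version tags embedded
# in a file, one per line, in ascending version order.
# Assumes GNU grep (-a, -o) and GNU sort (-V).
glibcxx_versions() {
    LC_ALL=C grep -a -o 'GLIBCXX_[0-9][0-9.]*' "$1" | LC_ALL=C sort -u -V
}

# Hypothetical helper: print the tags that appear in the binary ($1) but not
# in the library ($2). No output means every tag was found in the library.
missing_glibcxx() {
    provided=$(glibcxx_versions "$2")
    # -F fixed strings, -x whole-line match, -v keep only the non-matching tags
    glibcxx_versions "$1" | grep -v -x -F -e "$provided"
}
```

Run on each grid machine, something like
missing_glibcxx /path/to/moses /usr/lib64/libstdc++.so.6 (with the path to
your own moses binary filled in) should print nothing on a machine that
can run the binary, and the missing tags on one that can't.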

So for the moment my solution is to restrict my moses jobs to only run
on the machines with the newer distro installed.

Also, for anyone looking at this later, FWIW I'm running an older
version of moses, and not the current master. Not that that probably
matters in this case.

Cheers,
Lane


On Tue, Jan 10, 2012 at 7:32 AM, Hieu Hoang <[email protected]> wrote:
> ah, if it translates everything THEN segfault, it's likely to be a
> double-delete in 1 of the destructors.
>
> your friend might have added this macro
>    EXIT_RETURN
> which basically just avoids the destructors (Main.cpp line 501)
>
> however, it'll be good to know where it blows up and craft the destructors
> properly
>
>
> On Tue, Jan 10, 2012 at 7:17 PM, Lane Schwartz <[email protected]> wrote:
>>
>> No, I'm using plain text phrase tables and plain text language model
>> files.
>>
>> On Tue, Jan 10, 2012 at 6:48 AM, Hieu Hoang <[email protected]>
>> wrote:
>> > hey lane
>> >
>> > are you using binary kenlm files that was binarized previously?
>> >
>> > I think they're not compatible across gcc versions, until a recent
>> > change
>> > ken made. Due to some kinda #pragma memory alignment thingy apparently
>> >
>> > On Tue, Jan 10, 2012 at 4:06 AM, Lane Schwartz <[email protected]>
>> > wrote:
>> >>
>> >> After upgrading from CentOS 5.5 to Scientific Linux 6, I've
>> >> encountered some weird behavior.
>> >>
>> >> When I run moses, it successfully translates all of the sentences, but
>> >> then it (sometimes) segfaults. It doesn't segfault all the time,
>> >> though. One of the other guys in my office says he had this problem,
>> >> and figured out a simple fix for it, but unfortunately he doesn't
>> >> remember what the fix was.
>> >>
>> >> Has anyone else seen anything like this?
>> >>
>> >> Thanks,
>> >> Lane
>> >> _______________________________________________
>> >> Moses-support mailing list
>> >> [email protected]
>> >> http://mailman.mit.edu/mailman/listinfo/moses-support
>> >
>> >
>>
>>
>>
>> --
>> When a place gets crowded enough to require ID's, social collapse is not
>> far away.  It is time to go elsewhere.  The best thing about space travel
>> is that it made it possible to go elsewhere.
>>                 -- R.A. Heinlein, "Time Enough For Love"
>
>

