Re: [Moses-support] GIZA++: glibc detected (Angelina Ivanova)

Joerg Tiedemann Fri, 22 Jul 2011 10:47:12 -0700

I had a similar problem with g++ 4.4 (GIZA++ crashed on some smaller
data sets). I found this thread
http://permalink.gmane.org/gmane.comp.nlp.moses.user/4079
and reverting to g++ 4.1 removed the problem.
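
For anyone comparing their setup against this report, here is a minimal
sketch of checking a compiler version string. It assumes the
"gcc version X.Y.Z" format that appears in the environment listing quoted
below. The helper names, and the idea of flagging exactly 4.4, are
illustrative assumptions drawn only from this thread, not a known list of
affected compilers.

```python
import re


def parse_gcc_version(text):
    """Return (major, minor) from a 'gcc version X.Y...' string, or None."""
    match = re.search(r"gcc version (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_suspect_compiler(text, suspect=(4, 4)):
    """True if the string reports the version that crashed in this thread.

    The default (4, 4) is an assumption taken from the report above.
    """
    return parse_gcc_version(text) == suspect


# Version string as given in the environment listing of the original report.
reported = "gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)"
print(parse_gcc_version(reported), is_suspect_compiler(reported))
```

If the check flags your compiler, rebuilding GIZA++ with an older g++ is
the workaround that helped here. How to select the compiler depends on your
build setup.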


There is also a comment
http://comments.gmane.org/gmane.comp.nlp.moses.user/4079
with a different solution.

I hope this helps,
Jörg


On Fri, Jul 22, 2011 at 7:09 PM, Angelina Ivanova <[email protected]> wrote:
> Hello!
> Thank you for the fast reply. Yes, I saw some links on GIZA++, but I
> didn't find a solution or a hint about what can cause this error.
>
> My environment is:
> #62 UBUNTU 2.6.32-32-generic-pae
> Moses Built on Jan 28 2009
> gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)
> giza-pp-v1.0.2
>
> However, I can run Moses successfully on the other data.
>
>
>
> On Fri, Jul 22, 2011 at 6:34 PM, Thu Vuong Hoai <[email protected]> wrote:
>> Hello,
>> I found your error on the GIZA++ issues page. Could you please check this
>> link: http://code.google.com/p/giza-pp/issues/detail?id=15 ? I suspect it
>> may not be enough for you, so I also want to ask about issue 11: did you
>> fix it? And could you please provide more information about your environment?
>> On Fri, Jul 22, 2011 at 11:04 PM, <[email protected]> wrote:
>>>
>>> Today's Topics:
>>>
>>>   1. Re: Using Moses language models (Barry Haddow)
>>>   2. Re: Using Moses language models (Marc LEGENDRE)
>>>   3. GIZA++: glibc detected (Angelina Ivanova)
>>>
>>>
>>> ----------------------------------------------------------------------
>>>
>>> Message: 1
>>> Date: Fri, 22 Jul 2011 09:14:47 +0100
>>> From: Barry Haddow <[email protected]>
>>> Subject: Re: [Moses-support] Using Moses language models
>>> To: [email protected], [email protected]
>>> Message-ID: <[email protected]>
>>> Content-Type: text/plain;  charset="utf-8"
>>>
>>> On Friday 22 July 2011 03:50, Hieu Hoang wrote:
>>> > true, & there's no right answer to it.
>>> >
>>> > I suppose 1 goal of the trunk is to make sure that the core
>>> > functionality
>>> > of translating isn't affected too much, in terms of quality, speed, or
>>> > memory. Another goal is to make sure not to overburden the API with things
>>> > no-one else uses or implements.
>>> >
>>> > therefore, i think a good strategy is to branch & do what you like
>>> >
>>>
>>> Hi Hieu
>>>
>>> I'm not sure I see the point of implementing this in a branch and never
>>> merging. That's not a branch, it's a fork. The point of doing a small
>>> change
>>> like this in a branch would be so that the LM interface experts (ie you
>>> and
>>> Ken and ...) could have a look at it before it gets merged in.
>>>
>>> As regards how to implement the interface changes, what would be the
>>> consequences of having other LM implementations throw an exception or an
>>> assert for ngram_length? I think returning -1 is a very bad idea,
>>> especially
>>> as the return value is probably a size_t, and returning 0 could also lead
>>> to
>>> subtle and confusing behaviour. However if there is a return value with
>>> the
>>> semantics of "don't know" then that would be the ideal solution.
>>>
>>> cheers - Barry
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>>
>>>
>>> ------------------------------
>>>
>>> Message: 2
>>> Date: Fri, 22 Jul 2011 10:21:44 +0200 (CEST)
>>> From: Marc LEGENDRE <[email protected]>
>>> Subject: Re: [Moses-support] Using Moses language models
>>> To: [email protected]
>>> Cc: [email protected]
>>> Message-ID:
>>>        <[email protected]>
>>> Content-Type: text/plain; charset=ISO-8859-15
>>>
>>> Well, we (me and the people I work with) were hoping not to have to
>>> maintain
>>> a modified version of Moses.
>>>
>>> Luckily, obviousness just hit me like a truck : if something is specific
>>> to a LM,
>>> it does not have to be in the top layer.
>>> Having a common interface does not prevent subclasses from having a
>>> specific behaviour,
>>> we could have a LanguageModelKen method, say GetValueForgotStateKen(...)
>>> which would return
>>> something specific, say a LMKenResult, which would contain a LMResult plus
>>> other things
>>> like, say, a ngram_length field :-).
>>> And the virtual GetValueForgotState() method would simply return the
>>> LMResult from there.
>>>
>>> This way, no need to break the high level API,
>>> and no extra maintenance cost for us (me and the peop... Well, you know).
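
The idea above could be sketched roughly as follows. This is a Python
illustration only, not Moses code: the class and method names mirror the
ones mentioned in this thread, while the toy lookup table and its
longest-suffix matching are assumptions made purely so the example runs.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass
class LMResult:
    """Result exposed through the common interface."""
    score: float
    unknown: bool


@dataclass
class LMKenResult(LMResult):
    """Implementation-specific result: an LMResult plus ngram_length."""
    ngram_length: int


class LanguageModel:
    """Common interface; knows nothing about ngram_length."""

    def get_value_forgot_state(self, words: Sequence[str]) -> LMResult:
        raise NotImplementedError


class LanguageModelKen(LanguageModel):
    """Toy stand-in for the kenlm wrapper (hypothetical, no real scoring)."""

    def __init__(self, table: Dict[Tuple[str, ...], float],
                 unk_score: float = -100.0):
        self._table = table          # maps n-gram tuples to toy scores
        self._unk_score = unk_score  # toy score when nothing matches

    def get_value_forgot_state_ken(self, words: Sequence[str]) -> LMKenResult:
        # Try the longest suffix first and report the length that matched.
        for n in range(len(words), 0, -1):
            key = tuple(words[-n:])
            if key in self._table:
                return LMKenResult(self._table[key], False, n)
        return LMKenResult(self._unk_score, True, 0)

    def get_value_forgot_state(self, words: Sequence[str]) -> LMResult:
        # The generic method just drops the extra field.
        full = self.get_value_forgot_state_ken(words)
        return LMResult(full.score, full.unknown)


model = LanguageModelKen({("a",): -1.0, ("on", "a"): -0.5})
print(model.get_value_forgot_state_ken(["looking", "on", "a"]))
print(model.get_value_forgot_state(["looking", "on", "a"]))
```

Callers that only know the common interface keep working unchanged, and
only code that holds a LanguageModelKen can ask for the extra field.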
>>>
>>> ----- Original message -----
>>> > From: "Hieu Hoang" <[email protected]>
>>> > To: "Kenneth Heafield" <[email protected]>
>>> > Cc: [email protected]
>>> > Sent: Friday 22 July 2011 04:50:14
>>> > Subject: Re: [Moses-support] Using Moses language models
>>> >
>>> >
>>> > true, & there's no right answer to it.
>>> >
>>> > I suppose 1 goal of the trunk is to make sure that the core
>>> > functionality of translating isn't affected too much, in terms of
>>> > quality, speed, or memory. Another goal is to make sure not to overburden
>>> > the API with things no-one else uses or implements.
>>> >
>>> > therefore, i think a good strategy is to branch & do what you like
>>> >
>>> >
>>> > On 21 July 2011 22:46, Kenneth Heafield < [email protected] >
>>> > wrote:
>>> >
>>> >
>>> > Marc makes a good point. When one language model provides more
>>> > information than do other language models, it's difficult to maintain
>>> > a
>>> > common abstraction layer. Currently we're looking at n-gram length.
>>> > SRILM doesn't provide access to that (but you can get right-looking
>>> > state length which is usually the same thing).
>>> >
>>> > I'm working on making this issue more severe with left-looking state
>>> > optimization and explicit hypothesis bounds. How do we change the
>>> > decoder to use these features if not all of the language models
>>> > support
>>> > them?
>>> >
>>> > Maybe another class in the language model hierarchy supporting these
>>> > additional features. But it's going to make the decoder look ugly if
>>> > you want to support both.
>>> >
>>> >
>>> >
>>> >
>>> > On 07/21/11 11:14, Hieu Hoang wrote:
>>> > > hi marc,
>>> > >
>>> > > it'll be good for people to see your changes.
>>> > >
>>> > > i suppose you should create a branch and make your changes in
>>> > > there.
>>> > >
>>> > > If there are other people interested, you can point them to your
>>> > > branch.
>>> > > If more people are interested and it doesn't affect other people
>>> > > too
>>> > > much, then we can move it to trunk.
>>> > >
>>> > > i'll email you offline with svn details
>>> > >
>>> > > On 21/07/2011 15:16, Marc LEGENDRE wrote:
>>> > >> Alright, I gave this a try, and it did it for me.
>>> > >> With kenlm, it is a ridiculously straightforward modification,
>>> > >> but now I'm not sure how I can submit it :
>>> > >> on one hand, I am not a "machine translation guy" and I don't
>>> > >> imagine myself
>>> > >> digging in every other LM to find how to set the ngram_length
>>> > >> value;
>>> > >> and on the other hand I would feel guilty to submit a 10-line
>>> > >> patch and say
>>> > >> "Guys, I need this, would you mind committing it and doing
>>> > >> yourselves the
>>> > >> necessary modifications in every other wrapper ?"
>>> > >>
>>> > >> How do you, Moses developers, feel about this ?
>>> > >> Is it acceptable / outrageously stupid if I set the value to -1 in
>>> > >> the other wrappers,
>>> > >> maybe with a TODO, and properly document it in the super class ?
>>> > >>
>>> > >> ----- Original message -----
>>> > >>> From: "Kenneth Heafield"< [email protected] >
>>> > >>> To: [email protected]
>>> > >>> Sent: Wednesday 13 July 2011 20:53:46
>>> > >>> Subject: Re: [Moses-support] Using Moses language models
>>> > >>>
>>> > >>> I'd suggest adding a ngram_length member to LMResult then
>>> > >>> modifying
>>> > >>> each
>>> > >>> model's wrapper (or just mine) to set that value.
>>> > >>>
>>> > >>> You're welcome to move stuff from LanguageModelKen.cpp to
>>> > >>> LanguageModelKen.h as necessary. I chose this setup to minimize
>>> > >>> unnecessary includes.
>>> > >>>
>>> > >>> Kenneth
>>> > >>>
>>> > >>> On 07/13/11 14:33, Marc LEGENDRE wrote:
>>> > >>>> Well, not only the header is not "public", so to speak, (which I
>>> > >>>> agree is not a major obstacle)
>>> > >>>> but also the desired pointer is a private member of the class,
>>> > >>>> and
>>> > >>>> sadly lacks a getter.
>>> > >>>> As far as I know, it means that accessing it will involve
>>> > >>>> questionable C++ tricks.
>>> > >>>> (never tried, though)
>>> > >>>>
>>> > >>>> If modifying Moses is not too much of a chore, I'll give it a
>>> > >>>> thought.
>>> > >>>>
>>> > >>>> Anyway, thank you for your answers.
>>> > >>>>
>>> > >>>> ----- Original message -----
>>> > >>>>> From: "Hieu Hoang"< [email protected] >
>>> > >>>>> To: [email protected]
>>> > >>>>> Sent: Wednesday 13 July 2011 18:40:11
>>> > >>>>> Subject: Re: [Moses-support] Using Moses language models
>>> > >>>>> i guess lm::Model is specific to the ken lm implementation. If
>>> > >>>>> you
>>> > >>>>> want
>>> > >>>>> use it you should include the header yourself and cast whatever
>>> > >>>>> you
>>> > >>>>> need
>>> > >>>>> to get the pointer.
>>> > >>>>>
>>> > >>>>> if you're feeling generous, maybe you can extend the moses LM
>>> > >>>>> wrapper
>>> > >>>>> so
>>> > >>>>> that all LM implementations have the opportunity to return the
>>> > >>>>> length of the n-gram match.
>>> > >>>>>
>>> > >>>>> On 13/07/2011 21:51, Marc LEGENDRE wrote:
>>> > >>>>>> The length of the n-gram match is sufficient for what I want,
>>> > >>>>>> indeed.
>>> > >>>>>> I figured out how to get it using kenlm directly, but as I
>>> > >>>>>> am
>>> > >>>>>> running the decoder, I wanted to use the already loaded LM.
>>> > >>>>>>
>>> > >>>>>> I first tried to dig my way through the Moses abstraction
>>> > >>>>>> layers
>>> > >>>>>> to
>>> > >>>>>> retrieve a pointer to a lm::Model from kenlm, but the
>>> > >>>>>> Moses::LanguageModelKen header is not part of the public
>>> > >>>>>> headers
>>> > >>>>>> of
>>> > >>>>>> Moses ; that's why I tried to use only Moses interface.
>>> > >>>>>>
>>> > >>>>>> (I did not mention this alternative; if someone knows
>>> > >>>>>> how
>>> > >>>>>> to
>>> > >>>>>> get such a pointer, I can carry on from there)
>>> > >>>>>>
>>> > >>>>>>
>>> > >>>>>> ----- Original message -----
>>> > >>>>>>> From: "Kenneth Heafield"< [email protected] >
>>> > >>>>>>> To: "Marc LEGENDRE"< [email protected] >
>>> > >>>>>>> Sent: Wednesday 13 July 2011 16:12:27
>>> > >>>>>>> Subject: Re: [Moses-support] Using Moses language models
>>> > >>>>>>> The definition of unknown is that the word you asked for (the
>>> > >>>>>>> rightmost
>>> > >>>>>>> one) is mapped to<unk> i.e. an OOV.
>>> > >>>>>>>
>>> > >>>>>>> Are you looking for:
>>> > >>>>>>>
>>> > >>>>>>> 1) Length of n-gram matched in the model
>>> > >>>>>>>
>>> > >>>>>>> or
>>> > >>>>>>>
>>> > >>>>>>> 2) Length of state you must keep for valid continuation to
>>> > >>>>>>> the
>>> > >>>>>>> right
>>> > >>>>>>>
>>> > >>>>>>> These are slightly different things due to state
>>> > >>>>>>> minimization.
>>> > >>>>>>> The
>>> > >>>>>>> moses abstraction layer does not return either in a general
>>> > >>>>>>> way.
>>> > >>>>>>> However, if you're using KenLM, #2 is in the returned state's
>>> > >>>>>>> valid_length_. Further, #1 is in
>>> > >>>>>>> FullScoreReturn.ngram_length.
>>> > >>>>>>> So
>>> > >>>>>>> if
>>> > >>>>>>> you call KenLM directly these are easy to obtain (and you can
>>> > >>>>>>> decide
>>> > >>>>>>> whether to expose them through the Moses abstraction layer).
>>> > >>>>>>>
>>> > >>>>>>> Outside the decoder, you can run
>>> > >>>>>>>
>>> > >>>>>>> kenlm/query model_file null
>>> > >>>>>>>
>>> > >>>>>>> then provide your trigrams on stdin.
>>> > >>>>>>>
>>> > >>>>>>> Here's an example with kenlm/query kenlm/lm/test.arpa null
>>> > >>>>>>>
>>> > >>>>>>> looking on a
>>> > >>>>>>> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
>>> > >>>>>>> Total: -1.79818 OOV: 0
>>> > >>>>>>>
>>> > >>>>>>> The format is "word=vocab_id ngram_length score". So this is
>>> > >>>>>>> a
>>> > >>>>>>> trigram
>>> > >>>>>>> in the model because "a=5 3" appears.
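
A minimal sketch of reading that output in a script, assuming exactly the
"word=vocab_id ngram_length score" layout shown in the example above (other
kenlm versions may print something different). The function and type names
are hypothetical, chosen for illustration.

```python
from typing import List, NamedTuple


class ScoredWord(NamedTuple):
    word: str
    vocab_id: int
    ngram_length: int
    score: float


def parse_query_line(line: str) -> List[ScoredWord]:
    """Parse one per-sentence line of 'word=vocab_id ngram_length score' triples."""
    tokens = line.split()
    if len(tokens) % 3 != 0:
        raise ValueError("expected groups of three fields: %r" % line)
    parsed = []
    for i in range(0, len(tokens), 3):
        # rpartition keeps any '=' inside the word itself intact.
        word, _, vocab_id = tokens[i].rpartition("=")
        parsed.append(ScoredWord(word, int(vocab_id),
                                 int(tokens[i + 1]), float(tokens[i + 2])))
    return parsed


def matched_order(line: str) -> int:
    """Length of the n-gram matched for the last word on the line."""
    return parse_query_line(line)[-1].ngram_length


line = "looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513"
print(matched_order(line))  # 3 here, i.e. the full trigram was found
```

The "Total: ... OOV: ..." summary line does not follow the triple layout,
so this sketch raises ValueError on it; skip that line before parsing.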
>>> > >>>>>>>
>>> > >>>>>>> On 07/13/11 08:50, Marc LEGENDRE wrote:
>>> > >>>>>>>> Hello,
>>> > >>>>>>>>
>>> > >>>>>>>> I am trying to use the language models loaded by Moses ;
>>> > >>>>>>>>
>>> > >>>>>>>> I am using a 3-gram LM, and I need to know whether it
>>> > >>>>>>>> contains
>>> > >>>>>>>> a
>>> > >>>>>>>> given N-gram or not.
>>> > >>>>>>>> I tried to play around with
>>> > >>>>>>>> LanguageModelImplementation::GetValueForgotState(...),
>>> > >>>>>>>> but the boolean 'unknown' in the returned structure does not
>>> > >>>>>>>> seem
>>> > >>>>>>>> to
>>> > >>>>>>>> be what I'm looking for.
>>> > >>>>>>>>
>>> > >>>>>>>> Is there any simple way of getting this piece of information
>>> > >>>>>>>> ?
>>> > >>>>>>>>
>>> > >>>>>>>>
>>> > >>>>>>>> Regards,
>>> > >>>>>>>> Marc Legendre
>>> > >>>>>>>> _______________________________________________
>>> > >>>>>>>> Moses-support mailing list
>>> > >>>>>>>> [email protected]
>>> > >>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>>
>>>
>>> ------------------------------
>>>
>>> Message: 3
>>> Date: Fri, 22 Jul 2011 16:38:53 +0200
>>> From: Angelina Ivanova <[email protected]>
>>> Subject: [Moses-support] GIZA++: glibc detected
>>> To: [email protected]
>>> Message-ID:
>>>
>>>  <cahklk21bie0unchhrtdvqe69ep0i5k83+jvrnm7woiohocx...@mail.gmail.com>
>>> Content-Type: text/plain; charset=ISO-8859-1
>>>
>>> Hello,
>>> I got the error below when I tried to align Russian to English. I
>>> searched for the error on the Internet and found that the cause of the
>>> problem could be a null (empty) sentence in the corpus. However, I
>>> didn't detect any null sentences in my corpus. The encoding is UTF-8,
>>> and all previous experiments with a corpus that contained this one
>>> as a subset went smoothly. Could you please help me?
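
One way to double-check for null sentences is sketched below. It assumes a
line-aligned parallel corpus (one sentence per line on each side). The
function name and the sample data are made up for illustration. Note that
it treats whitespace-only lines as empty too, which a plain search for
blank lines can miss.

```python
import itertools


def find_problem_lines(source_lines, target_lines):
    """Return (line_number, reason) pairs for empty or unpaired sentences.

    Both arguments are iterables of strings, e.g. open file objects.
    """
    problems = []
    pairs = itertools.zip_longest(source_lines, target_lines)
    for number, (src, tgt) in enumerate(pairs, start=1):
        if src is None or tgt is None:
            # One side ran out of lines before the other.
            problems.append((number, "length mismatch"))
            break
        # strip() with no arguments removes all Unicode whitespace.
        if not src.strip() or not tgt.strip():
            problems.append((number, "empty sentence"))
    return problems


# Toy data; with real files, pass the two opened corpus files instead.
source = ["first sentence\n", "   \n", "third sentence\n"]
target = ["pervoe\n", "vtoroe\n", "tretye\n"]
print(find_problem_lines(source, target))
```

With real data the two arguments would be the opened source and target
files, read as UTF-8.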
>>>
>>> *** glibc detected ***/moses/tools/bin/GIZA++: double free or
>>> corruption (out): 0x14901578 ***
>>> ======= Backtrace: =========
>>> [0x8166e81]
>>> [0x8168946]
>>> [0x813ebb1]
>>> [0x80e6fe9]
>>> [0x80d8420]
>>> [0x80da791]
>>> [0x806f55a]
>>> [0x80742e8]
>>> [0x814d9bb]
>>> [0x8048151]
>>> ======= Memory map: ========
>>> 00d4e000-00d4f000 r-xp 00000000 00:00 0          [vdso]
>>> 08048000-081f6000 r-xp 00000000 00:1e 1612751353  /moses/tools/bin/GIZA++
>>> 081f6000-081f8000 rw-p 001ae000 00:1e 1612751353  /moses/tools/bin/GIZA++
>>> 081f8000-081ff000 rw-p 00000000 00:00 0
>>> 082ce000-1580d000 rw-p 00000000 00:00 0          [heap]
>>> b5f00000-b5f23000 rw-p 00000000 00:00 0
>>> b5f23000-b6000000 ---p 00000000 00:00 0
>>> b6093000-b6106000 rw-p 00000000 00:00 0
>>> b6179000-b7099000 rw-p 00000000 00:00 0
>>> b70dd000-b7525000 rw-p 00000000 00:00 0
>>> b7561000-b76a7000 rw-p 00000000 00:00 0
>>> b76c0000-b7779000 rw-p 00000000 00:00 0
>>> bfb6a000-bfb7f000 rw-p 00000000 00:00 0          [stack]
>>> Exit code: 1
>>>
>>>
>>> ------------------------------
>>>
>>>
>>>
>>> End of Moses-support Digest, Vol 57, Issue 40
>>> *********************************************
>>
>>
>>
>> --
>> Thu.
>>
>>
>>
>
>



-- 
**********************************************************************************
 Jörg Tiedemann                                     [email protected]
 Dep. of Linguistics and Philology
http://stp.lingfil.uu.se/~joerg/
 Uppsala University                                  tel:  +46 (0)18 - 471 1412
 Box 635, SE-751 26 Uppsala/SWEDEN   fax: +46 (0)18 - 471 1094

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
