Re: [Moses-support] GIZA++: glibc detected (Angelina Ivanova)

Thu Vuong Hoai
Fri, 22 Jul 2011 09:35:47 -0700

Hello,

I found your error on the GIZA++ issues page. Could you please check this
link: http://code.google.com/p/giza-pp/issues/detail?id=15
I suspect it may not be enough to solve your problem, so I also want to ask
about issue 11: have you fixed it? And could you please provide more
information about your environment?


On Fri, Jul 22, 2011 at 11:04 PM, <[email protected]> wrote:

> Send Moses-support mailing list submissions to
>        [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        http://mailman.mit.edu/mailman/listinfo/moses-support
> or, via email, send a message with subject or body 'help' to
>        [email protected]
>
> You can reach the person managing the list at
>        [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Moses-support digest..."
>
>
> Today's Topics:
>
>   1. Re: Using Moses language models (Barry Haddow)
>   2. Re: Using Moses language models (Marc LEGENDRE)
>   3. GIZA++: glibc detected (Angelina Ivanova)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 22 Jul 2011 09:14:47 +0100
> From: Barry Haddow <[email protected]>
> Subject: Re: [Moses-support] Using Moses language models
> To: [email protected], [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain;  charset="utf-8"
>
> On Friday 22 July 2011 03:50, Hieu Hoang wrote:
> > true, & there's no right answer to it.
> >
> > I suppose 1 goal of the trunk is to make sure that the core functionality
> > of translating isn't affected too much, in terms of quality, speed, or
> > memory. Another goal is to make sure not to overburden the API with things
> > no one else uses or implements.
> >
> > therefore, i think a good strategy is to branch & do what you like
> >
>
> Hi Hieu
>
> I'm not sure I see the point of implementing this in a branch and never
> merging. That's not a branch, it's a fork. The point of doing a small
> change like this in a branch would be so that the LM interface experts
> (ie you and Ken and ...) could have a look at it before it gets merged in.
>
> As regards how to implement the interface changes, what would be the
> consequences of having other LM implementations throw an exception or an
> assert for ngram_length? I think returning -1 is a very bad idea,
> especially as the return value is probably a size_t, and returning 0 could
> also lead to subtle and confusing behaviour. However if there is a return
> value with the semantics of "don't know" then that would be the ideal
> solution.
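
One possible shape for a "don't know" return value, as a sketch only:
assuming a C++17 compiler, std::optional<std::size_t> can hold either a
length or nothing (boost::optional plays the same role on older compilers).
LMResultSketch, its fields, and the two helper functions are hypothetical
stand-ins, not the actual Moses types.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical, simplified result type; the real Moses LMResult differs.
struct LMResultSketch {
    float score = 0.0f;
    bool unknown = false;
    // Empty when the LM implementation cannot report the matched length.
    std::optional<std::size_t> ngram_length;
};

// A wrapper that knows the matched n-gram length fills the field in.
LMResultSketch result_with_length(float score, std::size_t length) {
    LMResultSketch r;
    r.score = score;
    r.ngram_length = length;
    return r;
}

// A wrapper that cannot report the length leaves the field empty.
LMResultSketch result_without_length(float score) {
    LMResultSketch r;
    r.score = score;
    return r;
}
```

A caller then has to check has_value() before reading the length, so
neither -1 stored in a size_t nor a misleading 0 can slip through
unnoticed.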
>
> cheers - Barry
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 22 Jul 2011 10:21:44 +0200 (CEST)
> From: Marc LEGENDRE <[email protected]>
> Subject: Re: [Moses-support] Using Moses language models
> To: [email protected]
> Cc: [email protected]
> Message-ID:
>        <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-15
>
> Well, we (me and the people I work with) were hoping not to have to
> maintain a modified version of Moses.
>
> Luckily, obviousness just hit me like a truck: if something is specific
> to a LM, it does not have to be in the top layer.
> Having a common interface does not prevent subclasses from having
> specific behaviour. We could have a LanguageModelKen method, say
> GetValueForgotStateKen(...), which would return something specific, say
> a LMKenResult, which would contain a LMResult plus other things like,
> say, a ngram_length field :-).
> And the virtual GetValueForgotState() method would simply return the
> LMResult from there.
>
> This way, no need to break the high level API,
> and no extra maintenance cost for us (me and the peop... Well, you know).
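
The layering described above could look roughly like the following. This is
a sketch under loud assumptions: the structs and classes below are
simplified stand-ins that reuse the names from the message, the real Moses
signatures take arguments that are omitted here, and the returned values
are placeholders rather than a real KenLM query.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-ins; the real Moses classes and
// signatures differ.
struct LMResult {
    float score;
    bool unknown;
};

// KenLM-specific result: the common result plus extra information.
struct LMKenResult {
    LMResult base;
    std::size_t ngram_length;
};

class LanguageModelImplementation {
public:
    virtual ~LanguageModelImplementation() = default;
    // Common interface shared by every LM wrapper.
    virtual LMResult GetValueForgotState() const = 0;
};

class LanguageModelKen : public LanguageModelImplementation {
public:
    // KenLM-specific entry point, visible only to callers that know they
    // hold a LanguageModelKen.
    LMKenResult GetValueForgotStateKen() const {
        // Placeholder values; a real wrapper would query KenLM here.
        return LMKenResult{LMResult{-0.0483513f, false}, 3};
    }

    // The common interface forwards and drops the KenLM-only extras.
    LMResult GetValueForgotState() const override {
        return GetValueForgotStateKen().base;
    }
};
```

Code that only sees LanguageModelImplementation is unaffected, while code
that holds a LanguageModelKen can call GetValueForgotStateKen() to read
ngram_length.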
>
> ----- Original Message -----
> > From: "Hieu Hoang" <[email protected]>
> > To: "Kenneth Heafield" <[email protected]>
> > Cc: [email protected]
> > Sent: Friday 22 July 2011 04:50:14
> > Subject: Re: [Moses-support] Using Moses language models
> >
> >
> > true, & there's no right answer to it.
> >
> > I suppose 1 goal of the trunk is to make sure that the core
> > functionality of translating isn't affected too much, in terms of
> > quality, speed, or memory. Another goal is to make sure not to overburden
> > the API with things no one else uses or implements.
> >
> > therefore, i think a good strategy is to branch & do what you like
> >
> >
> > On 21 July 2011 22:46, Kenneth Heafield < [email protected] >
> > wrote:
> >
> >
> > Marc makes a good point. When one language model provides more
> > information than do other language models, it's difficult to maintain
> > a common abstraction layer. Currently we're looking at n-gram length.
> > SRILM doesn't provide access to that (but you can get right-looking
> > state length which is usually the same thing).
> >
> > I'm working on making this issue more severe with left-looking state
> > optimization and explicit hypothesis bounds. How do we change the
> > decoder to use these features if not all of the language models
> > support them?
> >
> > Maybe another class in the language model hierarchy supporting these
> > additional features. But it's going to make the decoder look ugly if
> > you want to support both.
> >
> >
> >
> >
> > On 07/21/11 11:14, Hieu Hoang wrote:
> > > hi marc,
> > >
> > > it'll be good for people to see your changes.
> > >
> > > i suppose you should create a branch and make your changes in
> > > there.
> > >
> > > If there are other people interested, you can point them to your
> > > branch.
> > > > If more people are interested and it doesn't affect other people too
> > > > much, then we can move it to trunk.
> > >
> > > i'll email you offline with svn details
> > >
> > > On 21/07/2011 15:16, Marc LEGENDRE wrote:
> > > >> Alright, I gave this a try, and it did the trick for me.
> > > >> With kenlm, it is a ridiculously straightforward modification,
> > > >> but now I'm not sure how I can submit it:
> > > >> on one hand, I am not a "machine translation guy" and I don't
> > > >> imagine myself digging in every other LM to find how to set the
> > > >> ngram_length value;
> > > >> and on the other hand I would feel guilty to submit a 10-line
> > > >> patch and say "Guys, I need this, would you mind committing it and
> > > >> making the necessary modifications in every other wrapper
> > > >> yourselves?"
> > > >>
> > > >> How do you, Moses developers, feel about this?
> > > >> Is it acceptable / outrageously stupid if I set the value to -1 in
> > > >> the other wrappers, maybe with a TODO, and properly document it in
> > > >> the super class?
> > >>
> > > >> ----- Original Message -----
> > > >>> From: "Kenneth Heafield"< [email protected] >
> > > >>> To: [email protected]
> > > >>> Sent: Wednesday 13 July 2011 20:53:46
> > > >>> Subject: Re: [Moses-support] Using Moses language models
> > >>>
> > > >>> I'd suggest adding a ngram_length member to LMResult then
> > > >>> modifying each model's wrapper (or just mine) to set that value.
> > >>>
> > >>> You're welcome to move stuff from LanguageModelKen.cpp to
> > >>> LanguageModelKen.h as necessary. I chose this setup to minimize
> > >>> unnecessary includes.
> > >>>
> > >>> Kenneth
> > >>>
> > >>> On 07/13/11 14:33, Marc LEGENDRE wrote:
> > > >>>> Well, not only is the header not "public", so to speak (which I
> > > >>>> agree is not a major obstacle),
> > > >>>> but the desired pointer is also a private member of the class,
> > > >>>> and sadly lacks a getter.
> > > >>>> As far as I know, it means that accessing it will involve
> > > >>>> questionable C++ tricks.
> > > >>>> (never tried, though)
> > > >>>>
> > > >>>> If modifying Moses is not too much of a chore, I'll give it a
> > > >>>> thought.
> > >>>>
> > >>>> Anyway, thank you for your answers.
> > >>>>
> > > >>>> ----- Original Message -----
> > > >>>>> From: "Hieu Hoang"< [email protected] >
> > > >>>>> To: [email protected]
> > > >>>>> Sent: Wednesday 13 July 2011 18:40:11
> > > >>>>> Subject: Re: [Moses-support] Using Moses language models
> > > >>>>>
> > > >>>>> i guess lm::Model is specific to the ken lm implementation. If
> > > >>>>> you want to use it you should include the header yourself and
> > > >>>>> cast whatever you need to get the pointer.
> > > >>>>>
> > > >>>>> if you're feeling generous, maybe you can extend the moses LM
> > > >>>>> wrapper so that all LM implementations have the opportunity to
> > > >>>>> return the length of the n-gram match.
> > >>>>>
> > >>>>> On 13/07/2011 21:51, Marc LEGENDRE wrote:
> > > >>>>>> The length of the n-gram match is sufficient for what I want,
> > > >>>>>> indeed.
> > > >>>>>> I figured out how to get it using kenlm directly, but as I am
> > > >>>>>> running the decoder, I wanted to use the already loaded LM.
> > > >>>>>>
> > > >>>>>> I first tried to dig my way through the Moses abstraction
> > > >>>>>> layers to retrieve a pointer to a lm::Model from kenlm, but the
> > > >>>>>> Moses::LanguageModelKen header is not part of the public
> > > >>>>>> headers of Moses; that's why I tried to use only the Moses
> > > >>>>>> interface.
> > > >>>>>>
> > > >>>>>> (I did not mention this alternative; if someone knows how to
> > > >>>>>> get such a pointer, I can carry on from there)
> > >>>>>>
> > >>>>>>
> > > >>>>>> ----- Original Message -----
> > > >>>>>>> From: "Kenneth Heafield"< [email protected] >
> > > >>>>>>> To: "Marc LEGENDRE"< [email protected] >
> > > >>>>>>> Sent: Wednesday 13 July 2011 16:12:27
> > > >>>>>>> Subject: Re: [Moses-support] Using Moses language models
> > > >>>>>>>
> > > >>>>>>> The definition of unknown is that the word you asked for (the
> > > >>>>>>> rightmost one) is mapped to <unk>, i.e. an OOV.
> > >>>>>>>
> > >>>>>>> Are you looking for:
> > >>>>>>>
> > >>>>>>> 1) Length of n-gram matched in the model
> > >>>>>>>
> > >>>>>>> or
> > >>>>>>>
> > > >>>>>>> 2) Length of state you must keep for valid continuation to
> > > >>>>>>> the right
> > > >>>>>>>
> > > >>>>>>> These are slightly different things due to state
> > > >>>>>>> minimization. The moses abstraction layer does not return
> > > >>>>>>> either in a general way.
> > > >>>>>>> However, if you're using KenLM, #2 is in the returned state's
> > > >>>>>>> valid_length_. Further, #1 is in FullScoreReturn.ngram_length.
> > > >>>>>>> So if you call KenLM directly these are easy to obtain (and
> > > >>>>>>> you can decide whether to expose them through the Moses
> > > >>>>>>> abstraction layer).
> > >>>>>>>
> > >>>>>>> Outside the decoder, you can run
> > >>>>>>>
> > >>>>>>> kenlm/query model_file null
> > >>>>>>>
> > >>>>>>> then provide your trigrams on stdin.
> > >>>>>>>
> > >>>>>>> Here's an example with kenlm/query kenlm/lm/test.arpa null
> > >>>>>>>
> > >>>>>>> looking on a
> > >>>>>>> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
> > >>>>>>> Total: -1.79818 OOV: 0
> > >>>>>>>
> > > >>>>>>> The format is "word=vocab_id ngram_length score". So this is
> > > >>>>>>> a trigram in the model because "a=5 3" appears.
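
To read that output programmatically, a small parser along these lines may
be enough. It is a sketch that assumes exactly the
"word=vocab_id ngram_length score" layout described above (other kenlm
versions may print something different); QueryToken and parse_query_line
are hypothetical names, not part of kenlm.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One "word=vocab_id ngram_length score" entry from the query output.
struct QueryToken {
    std::string word;
    unsigned vocab_id;
    int ngram_length;
    double score;
};

// Hypothetical helper: parses one output line into its entries and stops
// at the first field that does not fit the expected pattern.
std::vector<QueryToken> parse_query_line(const std::string& line) {
    std::vector<QueryToken> tokens;
    std::istringstream in(line);
    std::string head;
    while (in >> head) {
        // Split "word=vocab_id" at the last '=' so that words which
        // themselves contain '=' are still handled.
        std::size_t eq = head.rfind('=');
        if (eq == std::string::npos) {
            break;
        }
        QueryToken t;
        t.word = head.substr(0, eq);
        std::istringstream id_in(head.substr(eq + 1));
        if (!(id_in >> t.vocab_id)) {
            break;
        }
        if (!(in >> t.ngram_length >> t.score)) {
            break;
        }
        tokens.push_back(t);
    }
    return tokens;
}
```

For the example line above, the last entry has word "a" and ngram_length
3, which is the trigram match. The "Total: ..." summary line contains no
"word=vocab_id" field, so it parses to an empty list.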
> > >>>>>>>
> > >>>>>>> On 07/13/11 08:50, Marc LEGENDRE wrote:
> > >>>>>>>> Hello,
> > >>>>>>>>
> > > >>>>>>>> I am trying to use the language models loaded by Moses.
> > > >>>>>>>>
> > > >>>>>>>> I am using a 3-gram LM, and I need to know whether it
> > > >>>>>>>> contains a given N-gram or not.
> > > >>>>>>>> I tried to play around with
> > > >>>>>>>> LanguageModelImplementation::GetValueForgotState(...),
> > > >>>>>>>> but the boolean 'unknown' in the returned structure does not
> > > >>>>>>>> seem to be what I'm looking for.
> > > >>>>>>>>
> > > >>>>>>>> Is there any simple way of getting this piece of
> > > >>>>>>>> information?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Marc Legendre
> > >>>>>>>> _______________________________________________
> > >>>>>>>> Moses-support mailing list
> > >>>>>>>> [email protected]
> > >>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 22 Jul 2011 16:38:53 +0200
> From: Angelina Ivanova <[email protected]>
> Subject: [Moses-support] GIZA++: glibc detected
> To: [email protected]
> Message-ID:
>        <cahklk21bie0unchhrtdvqe69ep0i5k83+jvrnm7woiohocx...@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hello,
> I got the error below when I tried to align Russian to English. I
> searched for the error on the Internet and found out that the cause of
> the problem could be a null (empty) sentence in the corpus. However, I
> didn't find any null sentences in my corpus. The encoding is UTF-8,
> and all previous experiments, which used a corpus containing this one
> as a subset, went smoothly. Could you please help me?
>
> *** glibc detected *** /moses/tools/bin/GIZA++: double free or corruption (out): 0x14901578 ***
> ======= Backtrace: =========
> [0x8166e81]
> [0x8168946]
> [0x813ebb1]
> [0x80e6fe9]
> [0x80d8420]
> [0x80da791]
> [0x806f55a]
> [0x80742e8]
> [0x814d9bb]
> [0x8048151]
> ======= Memory map: ========
> 00d4e000-00d4f000 r-xp 00000000 00:00 0          [vdso]
> 08048000-081f6000 r-xp 00000000 00:1e 1612751353  /moses/tools/bin/GIZA++
> 081f6000-081f8000 rw-p 001ae000 00:1e 1612751353  /moses/tools/bin/GIZA++
> 081f8000-081ff000 rw-p 00000000 00:00 0
> 082ce000-1580d000 rw-p 00000000 00:00 0          [heap]
> b5f00000-b5f23000 rw-p 00000000 00:00 0
> b5f23000-b6000000 ---p 00000000 00:00 0
> b6093000-b6106000 rw-p 00000000 00:00 0
> b6179000-b7099000 rw-p 00000000 00:00 0
> b70dd000-b7525000 rw-p 00000000 00:00 0
> b7561000-b76a7000 rw-p 00000000 00:00 0
> b76c0000-b7779000 rw-p 00000000 00:00 0
> bfb6a000-bfb7f000 rw-p 00000000 00:00 0          [stack]
> Exit code: 1
>
>
> ------------------------------
>
>
> End of Moses-support Digest, Vol 57, Issue 40
> *********************************************
>



-- 
Thu.
