Hi, I have a few questions about this:

1. How would you analyse the sentiment of the source text, considering that the language pairs Apertium deals with are low-resource languages?
2. As Tino mentions, is there actually a problem of sentiment loss in Apertium? Any examples of this?
3. Doesn't sentiment analysis of a language require a decent amount of training data? Where would this data be found for low-resource languages?
Tanmai

On Fri, Feb 28, 2020 at 12:02 AM Rajarshi Roychoudhury <rroychoudhu...@gmail.com> wrote:

> The effect won't be very evident on simple sentences; I think it would be
> more effective on sentences where the choice of words can decide the
> quality of the translation. It's not about whether "Watch out" could be
> "be careful", it's about choosing words that retain the urgency of "watch
> out". Sentiment information about the original sentence can help with that.
>
> On Thu, Feb 27, 2020, 23:47 Scoop Gracie <scoopgra...@gmail.com> wrote:
>
>> So, "Watch out!" could become "Be careful"?
>>
>> On Thu, Feb 27, 2020, 10:13 Rajarshi Roychoudhury <rroychoudhu...@gmail.com> wrote:
>>
>>> It is not just about minimizing loss of sentiment; it is about using
>>> that information for better translation. A trivial example: in some
>>> situations a sentence can project a strong sentiment, and a simple
>>> translation may not always yield the best result. However, if we can
>>> use knowledge of the sentiment to choose the words, it might give a
>>> better result.
>>>
>>> As far as the code is concerned, I need to study the source code, or
>>> detailed documentation, before proposing a feasible solution.
>>>
>>> Best,
>>> Rajarshi
>>>
>>> On Thu, Feb 27, 2020, 23:21 Tino Didriksen <m...@tinodidriksen.com> wrote:
>>>
>>>> My first question would be: is this actually a problem for rule-based
>>>> machine translation? I am not a linguist, but given how RBMT works I
>>>> can't really see where sentiment would be lost in the process,
>>>> especially because Apertium is designed for related languages, where
>>>> sentiment is mostly the same. But even for less related languages, it
>>>> would come down to the quality of the source-language analysis.
>>>>
>>>> Beyond that, please learn how Apertium specifically works, not just
>>>> RBMT in general.
>>>> http://wiki.apertium.org/wiki/Documentation is a good start, but our
>>>> IRC channel is the best place to ask technical questions.
>>>>
>>>> One major issue specific to Apertium is that the source information
>>>> is no longer available in the target-generation step.
>>>>
>>>> E.g., since you mention English-Hindi, you could install
>>>> apertium-eng-hin and see how each part of the pipe works. We have
>>>> precompiled binaries for common platforms. Again, see the wiki and IRC.
>>>>
>>>> -- Tino Didriksen
>>>>
>>>> On Thu, 27 Feb 2020 at 08:16, Rajarshi Roychoudhury <rroychoudhu...@gmail.com> wrote:
>>>>
>>>>> Formally, I present my idea in this form. From my understanding of
>>>>> RBMT, an RBMT system contains:
>>>>>
>>>>> - an *SL morphological analyser* - analyses a source-language word
>>>>>   and provides its morphological information;
>>>>> - an *SL parser* - a syntax analyser which analyses source-language
>>>>>   sentences;
>>>>> - a *translator* - translates a source-language word into the target
>>>>>   language;
>>>>> - a *TL morphological generator* - generates appropriate
>>>>>   target-language words for the given grammatical information;
>>>>> - a *TL parser* - composes suitable target-language sentences.
>>>>>
>>>>> I propose a sixth component of the RBMT system: a *sentiment-based
>>>>> TL morphological generator*.
>>>>>
>>>>> I propose that we do word-level sentiment analysis of the source
>>>>> language and the target language. For the time being I want to work
>>>>> on English-Hindi translation.
>>>>> We do not need neural-network-based translation; however, to get
>>>>> the sentiment associated with each word we might use NLTK, or
>>>>> develop a character-level embedding just to find the sentiment
>>>>> associated with each word, and form a dictionary out of it. I have
>>>>> written a paper on this and received good results. So, during the
>>>>> final application development we will just have the dictionary, with
>>>>> no neural-network dependencies. This can easily be done with Python.
>>>>> I just need a good corpus of English and Hindi words (the sentiment
>>>>> datasets are available online).
>>>>>
>>>>> The *sentiment-based TL morphological generator* will generate the
>>>>> list of possible words, and we will take the word whose sentiment is
>>>>> closest to that of the source-language word. This is a novel method
>>>>> that has probably not been applied before, and it might generate
>>>>> better results.
>>>>>
>>>>> Please provide your valuable feedback and suggest any necessary
>>>>> changes that need to be made.
>>>>> Best,
>>>>> Rajarshi

--
*Khanna, Tanmai*
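The selection step Rajarshi proposes, generating candidate target-language words and keeping the one whose sentiment score is closest to the source word's, can be sketched as follows. The lexicon entries and scores below are invented stand-ins for the precomputed dictionary the proposal describes (e.g. one derived offline with NLTK), so the sketch itself has no external dependencies and is not existing Apertium or NLTK API.

```python
# Sketch of the proposed "sentiment-based TL generator" selection step.
# Hypothetical precomputed sentiment scores in [-1, 1]; a real dictionary
# would be built offline (e.g. from NLTK sentiment data), leaving no
# neural-network dependency at translation time.

SL_SENTIMENT = {"watch out": -0.6}   # source-language entries
TL_SENTIMENT = {                     # target-language candidates
    "be careful": -0.3,
    "beware": -0.7,
    "look": 0.0,
}

def pick_closest(source_phrase, candidates):
    """Return the candidate TL word whose sentiment score is closest
    to the SL phrase's score."""
    src = SL_SENTIMENT[source_phrase]
    return min(candidates, key=lambda w: abs(TL_SENTIMENT[w] - src))

print(pick_closest("watch out", ["be careful", "beware", "look"]))
# prints "beware": |-0.7 - (-0.6)| = 0.1 beats "be careful" at 0.3
```

With these made-up scores, "beware" wins over "be careful" because it better preserves the urgency of "watch out", which is exactly the behaviour the proposal is after.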
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff