I have revised the proposal to explain the process better. Kindly
take a look at it. The bilingual dictionary still needs some work; I
didn't have time to complete it as I was busy determining the sentiment tags. I
will try to incorporate it as soon as possible. Please suggest if any
The sentiment tags will help to form more detailed and diverse patterns,
which can help to form better rules for disambiguation, lexical selection and
reordering.
As for those languages where no SentiWordNet exists, a linguist
will be able to determine sentiment polarity. Since I have the
Hey, I have one doubt.
I didn't quite understand how sentiment analysis would fix the examples
given for mistranslation.
Also, what about languages for which a SentiWordNet doesn't exist?
Thanks and Regards,
Tanmai
On Fri, Mar 27, 2020 at 3:56 PM Rajarshi Roychoudhury <
Hi,
I have finished writing my proposal, written code for sentiment
analysis with character embeddings as a coding challenge, added words to the
monolingual and bilingual dictionaries, and designed a constraint grammar. I
am working on building the bidix and lrx files for now. Would be very
"A randomly generated password for Rroychoudhury has been sent to
rroychoudhu...@gmail.com."
-- Tino Didriksen
On Mon, 23 Mar 2020 at 03:10, Rajarshi Roychoudhury <
rroychoudhu...@gmail.com> wrote:
> I have completed writing my gsoc proposal, can I get a wiki account?
>
> Username:
I have completed writing my gsoc proposal, can I get a wiki account?
Username: rroychoudhury
email: rroychoudhu...@gmail.com
On Fri, Mar 6, 2020, 21:40 Rajarshi Roychoudhury
wrote:
> One is in .odt format, the other in .pdf. Kindly give it a read and give
> suggestions.
> Best,
> Rajarshi
>
> On
One is in .odt format, the other in .pdf. Kindly give it a read and give
suggestions.
Best,
Rajarshi
On Fri, 6 Mar 2020 at 21:15, Francis Tyers wrote:
> On 2020-03-06 15:35, Scoop Gracie wrote:
> > Sending it as .odt would be great.
> >
> > On Fri, Mar 6, 2020, 07:27 Rajarshi Roychoudhury
> >
On 2020-03-06 15:35, Scoop Gracie wrote:
Sending it as .odt would be great.
On Fri, Mar 6, 2020, 07:27 Rajarshi Roychoudhury
wrote:
Then how should I send the file? I don't know if there is anyone to
mentor this, since it is not from the list of ideas mentioned.
On Fri, Mar 6, 2020,
Sending it as .odt would be great.
On Fri, Mar 6, 2020, 07:27 Rajarshi Roychoudhury
wrote:
> Then how should I send the file? I don't know if there is anyone to mentor
> this, since it is not from the list of ideas mentioned.
>
> On Fri, Mar 6, 2020, 20:49 Francis Tyers wrote:
>
>> El
Then how should I send the file? I don't know if there is anyone to mentor
this, since it is not from the list of ideas mentioned.
On Fri, Mar 6, 2020, 20:49 Francis Tyers wrote:
> On 2020-03-06 08:40, Rajarshi Roychoudhury wrote:
> > Hi,
> > I have written my idea in the attached file.
On 2020-03-06 08:40, Rajarshi Roychoudhury wrote:
Hi,
I have written my idea in the attached file. It is just the idea,
not the project proposal. Kindly read the idea and give feedback on
whether this can be a feasible GSoC project.
Best,
Rajarshi
Please do not use proprietary formats
Hi,
I have written my idea in the attached file. It is just the idea, not the
project proposal. Kindly read the idea and give feedback on whether this
can be a feasible GSoC project.
Best,
Rajarshi
On Fri, 28 Feb 2020 at 06:31, Rajarshi Roychoudhury <
rroychoudhu...@gmail.com> wrote:
> Here
Here are some published papers on how character embeddings are used for
classification.
https://arxiv.org/abs/1810.03595
https://lsm.media.mit.edu/papers/tweet2vec_vvr.pdf
Tino Didriksen wrote:
> One major issue specific to Apertium is that the source information is no
> longer available in the target generation step.
It might make sense to have something like this right after bilingual
dictionary lookup (as an alternative or complement to lrx-proc).
Perhaps a
How exactly can characters predict sentiment? Don’t you still need some
training data for pairs? English, Hindi, Bangla aren’t really low resource
languages.
Anyway, we can continue this discussion on IRC so that it'll be easier and
more people can contribute.
Tanmai
To answer the question of how to analyse sentiment for a low-resource
language, I think character embeddings would be the best option. The words in a
corpus are not exhaustive, but the number of unique characters is certainly
well determined. We can figure out the embedding weight for each
As I mentioned earlier, I would like to work on English-Hindi or
English-Bengali translation; the dataset can be obtained from SentiWordNet
for Indian Languages,
https://amitavadas.com/sentiwordnet.php
which is by far the richest dataset available for sentiment
analysis. It contains data
Hi, I have a few questions about this:
1. How would you analyse the sentiment of the source text, considering that
the language pairs Apertium deals with are low-resource languages?
2. As Tino mentions, is there a problem of sentiment loss in Apertium? Any
examples of this?
3. Doesn't the
The effect won't be very evident on simple sentences; I think it would be
more effective on sentences where the choice of words can decide the quality
of the translation. It's not about whether "Watch out" could be "Be careful"; it's
about choosing words that can retain the urgency in "Watch out".
So, "Watch out!" could become "Be careful"?
On Thu, Feb 27, 2020, 10:13 Rajarshi Roychoudhury
wrote:
> It is not just about minimizing loss of sentiment; it is about using
> that information for better translation. A very trivial example would be
> that for some situations, sentences can
It is not just about minimizing loss of sentiment; it is about using that
information for better translation. A very trivial example would be that
in some situations, sentences can project a strong sentiment and a simple
translation may not always yield the best result. However if we can use the
My first question would be, is this actually a problem for rule-based
machine translation? I am not a linguist, but given how RBMT works I can't
really see where sentiment would be lost in the process, especially
because Apertium is designed for related languages where sentiment is
mostly the
I just need to know which libraries are used (STL, if any) to store the words
and how the translation is actually done. I plan to use an ordered map to
store the word as the key and its sentiment value as the value. I can choose the one
with the best sentiment by running an iterative search. Or a better idea would
Oh okay. That should be fine.
On Thu, Feb 27, 2020, 08:24 Rajarshi Roychoudhury
wrote:
> No, I just need Python to get the result, which can be written to a text
> file and read using C++. It won't depend on Python.
>
> On Thu, Feb 27, 2020, 21:52 Scoop Gracie wrote:
>
>> Oh, okay. So Python
No, I just need Python to get the result, which can be written to a text
file and read using C++. It won't depend on Python.
On Thu, Feb 27, 2020, 21:52 Scoop Gracie wrote:
> Oh, okay. So Python would not be needed at runtime?
>
> On Thu, Feb 27, 2020, 08:20 Rajarshi Roychoudhury <
>
Oh, okay. So Python would not be needed at runtime?
On Thu, Feb 27, 2020, 08:20 Rajarshi Roychoudhury
wrote:
> I just need to write the dictionary I would get in Python to a file and
> read it using C++. I guess I can use a map for this purpose.
>
> On Thu, Feb 27, 2020, 21:40 Scoop Gracie
I just need to write the dictionary I would get in Python to a file and
read it using C++. I guess I can use a map for this purpose.
On Thu, Feb 27, 2020, 21:40 Scoop Gracie wrote:
> I believe it must use C++, so nltk won't work.
>
> On Wed, Feb 26, 2020, 23:17 Rajarshi Roychoudhury <
>
I believe it must use C++, so nltk won't work.
On Wed, Feb 26, 2020, 23:17 Rajarshi Roychoudhury
wrote:
> Formally, I present my idea in this form:
> From my understanding of RBMT,
>
> The RBMT system contains:
>
>- a *SL morphological analyser* - analyses a source language word and
>
Formally, I present my idea in this form:
From my understanding of RBMT,
The RBMT system contains:
- a *SL morphological analyser* - analyses a source language word and
provides the morphological information;
- a *SL parser* - is a syntax analyser which analyses source language
It is absolutely fine to use languages you are most comfortable with.
On Wed, Feb 26, 2020, 22:18 Rajarshi Roychoudhury
wrote:
> I need to study more about RBMT to develop an idea of how to preserve
> sentiment while translating, which I think can increase the efficiency of
> translation. It
I need to study more about RBMT to develop an idea of how to preserve
sentiment while translating, which I think can increase the efficiency of
translation. It will also help my research; thank you so much for
suggesting it. Also, will it be okay if I work on languages I am
comfortable with? Say
I think it is worth looking into, it is just that anything that needs a
neural network is not possible. I'm sure sentiment translation is possible
in RBMT too.
On Wed, Feb 26, 2020, 21:58 Rajarshi Roychoudhury
wrote:
> OK, then I won't pursue this idea and will look for one in the ideas list.
>
>
OK, then I won't pursue this idea and will look for one in the ideas list.
On Thu, 27 Feb 2020 at 11:10, Scoop Gracie wrote:
> The main problem is that I don't believe there is a way to send
> information down the pipeline without breaking stuff.
>
> On Wed, Feb 26, 2020, 21:37 Rajarshi
The main problem is that I don't believe there is a way to send information
down the pipeline without breaking stuff.
On Wed, Feb 26, 2020, 21:37 Rajarshi Roychoudhury
wrote:
> Thank you so much for the feedback. I will try to think of another way of
> doing this without using neural networks
Thank you so much for the feedback. I will try to think of another way of
doing this without using neural networks, or propose a new project.
http://wiki.apertium.org/wiki/Apertium_for_Dummies#The_units_of_translation
is an excellent starting point for beginners; however, it would be very
helpful if
I'm not an expert in this, but given the non-neural nature of Apertium,
this does not seem feasible to me, at least in the way you described.
On Wed, Feb 26, 2020, 21:02 Rajarshi Roychoudhury
wrote:
> Hi,
> I am Rajarshi Roychoudhury, a second-year undergraduate student at Jadavpur
>
Hi,
I am Rajarshi Roychoudhury, a second-year undergraduate student at Jadavpur
University, Kolkata, India. I have done many projects in Natural Language
Processing, mainly focusing on sentiment analysis and machine translation.
Most machine translation systems have no explicit preservation of the