Great, that sounds awesome, Anthony. Friday at 10am PT it is. Please add chris.mattm...@gmail.com to your GHangout buddy list.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Anthony Beylerian <anthonybeyler...@hotmail.com>
Reply-To: "dev@opennlp.apache.org" <dev@opennlp.apache.org>
Date: Tuesday, March 29, 2016 at 8:32 AM
To: "dev@opennlp.apache.org" <dev@opennlp.apache.org>, Mondher Bouazizi <mondher.bouaz...@gmail.com>, Madhawa Kasun Gunasekara <madhaw...@gmail.com>
Cc: "d...@tika.apache.org" <d...@tika.apache.org>, Information and Data Science Group USC List <ird...@mymaillists.usc.edu>
Subject: RE: GSOC2016 Sentiment Analysis

>Dear Chris,
>
>Thank you again for reviewing our proposals.
>We are looking forward to working together on this.
>
>In our previous trials we used an annotated corpus made through
>CrowdFlower for testing, and would be happy to share it.
>Although relatively modest and noisy (~10k training, ~8k testing, ~20k
>pattern extraction), we believe it was enough to demonstrate encouraging
>performance.
>From our side, we also have a Java implementation that we would like to
>shape up for production; however, I'm also comfortable with Python in case
>we need it.
>
>On the other hand, it sounds intriguing to use a cross-lingual corpus; we
>would love to discuss it.
>As for the hangout session, I have just checked with Mondher and the time
>works for us.
>
>Best,
>
>Anthony
>
>
>> From: chris.a.mattm...@jpl.nasa.gov
>> To: mondher.bouaz...@gmail.com; madhaw...@gmail.com
>> CC: anthonybeyler...@hotmail.com; dev@opennlp.apache.org;
>> d...@tika.apache.org; ird...@mymaillists.usc.edu
>> Subject: Re: GSOC2016 Sentiment Analysis
>> Date: Tue, 29 Mar 2016 13:57:11 +0000
>>
>> I like both of your comments, Mondher and Madhawa. My team at USC
>> has been investigating the use of particular corpora, including
>> Fisher Callhome, to support sentiment analysis. We have been
>> writing Java code outside of both OpenNLP and Tika, but with the
>> goal of integrating it into both. We have a mix of Java and
>> Python code that we’d like to bring into both projects.
>>
>> I’m reviewing the proposals you wrote now, but would it make sense
>> to have a Google hangout this Friday, ~10am PT (Los Angeles time)?
>>
>> Cheers,
>> Chris
>>
>> -----Original Message-----
>> From: Mondher Bouazizi <mondher.bouaz...@gmail.com>
>> Date: Monday, March 28, 2016 at 11:46 PM
>> To: Madhawa Kasun Gunasekara <madhaw...@gmail.com>, jpluser
>> <chris.a.mattm...@jpl.nasa.gov>
>> Cc: Anthony Beylerian <anthonybeyler...@hotmail.com>,
>> "dev@opennlp.apache.org" <dev@opennlp.apache.org>, "d...@tika.apache.org"
>> <d...@tika.apache.org>,
>> Information and Data Science Group USC List
>> <ird...@mymaillists.usc.edu>
>> Subject: Re: GSOC2016 Sentiment Analysis
>>
>> >Dear Madhawa,
>> >
>> >Thank you for your interest in the proposals.
>> >The current tasks we proposed refer to classification and
>> >quantification regardless of the topic.
>> >This can be used in a larger context where the topic is not specified,
>> >or not unique, in which case we will need to identify the topic(s).
>> >Therefore, a topic detector would be a good idea to implement, in order
>> >to complement this.
>> >
>> >As for the Document Categorizer, it is a general-purpose component with
>> >basic features (n-gram, bag of words, etc.).
>> >
>> >It is basically used for the classification of texts into a set of
>> >classes defined by the user, whether they are sentiment classes or
>> >other.
>> >
>> >However, it doesn't perform well for this purpose.
>> >
>> >Furthermore, the sentiment analysis component would not just perform
>> >the naive classification but also additional tasks (e.g., quantification)
>> >and implement more specific and sophisticated approaches.
>> >
>> >Please share your thoughts.
>> >
>> >Mondher
>> >
>> >On Tue, Mar 29, 2016 at 1:51 PM, Madhawa Kasun Gunasekara
>> ><madhaw...@gmail.com> wrote:
>> >
>> >Hi Chris / Anthony
>> >
>> >Yes, I would like to work on this. This proposal addresses most of the
>> >things in sentiment analysis.
>> >
>> >AFAIK most people use the OpenNLP Document Categorizer for sentiment
>> >analysis, since there isn't proper functionality for sentiment
>> >analysis in OpenNLP. It would be great if we could add this feature to
>> >the OpenNLP project. I would also like to suggest that we should be
>> >able to detect the target object of the opinions with this feature as
>> >well.
>> >
>> >WDYT?
>> >
>> >Thanks,
>> >
>> >Madhawa
>> >
>> >On Tue, Mar 29, 2016 at 2:11 AM, Mattmann, Chris A (3980)
>> ><chris.a.mattm...@jpl.nasa.gov> wrote:
>> >
>> >Dear Anthony,
>> >
>> >Great! These both sound like fantastic proposals and I’m happy
>> >to be a mentor. Madhawa, would you like to join in on these
>> >efforts?
>> >
>> >Cheers,
>> >Chris
>> >
>> >-----Original Message-----
>> >From: Anthony Beylerian <anthonybeyler...@hotmail.com>
>> >Date: Monday, March 28, 2016 at 11:48 AM
>> >To: "dev@opennlp.apache.org" <dev@opennlp.apache.org>,
>> >"mondher.bouaz...@gmail.com" <mondher.bouaz...@gmail.com>
>> >Cc: Madhawa Kasun Gunasekara <madhaw...@gmail.com>, jpluser
>> ><chris.a.mattm...@jpl.nasa.gov>
>> >Subject: RE: GSOC2016 Sentiment Analysis
>> >
>> >>Dear Chris,
>> >>
>> >>Thank you for starting the discussion.
>> >>We are glad there is an interest in a sentiment analysis component.
>> >>
>> >>My colleague Mondher posted the two JIRA issues related to Sentiment
>> >>Analysis [1][2] as references for our proposals [3][4] for GSoC.
>> >>In fact, we have been researching this topic at our university.
>> >>We are hoping to participate this year and work on integrating both a
>> >>sentiment classifier and a quantifier for the library.
>> >>
>> >>It would be nice to also have an interface with Tika; maybe we can
>> >>collaborate?
>> >>We are also looking for mentors, in case someone is willing to support
>> >>our proposals.
>> >>
>> >>Best,
>> >>
>> >>Anthony
>> >>
>> >>[1] https://issues.apache.org/jira/browse/OPENNLP-842
>> >>[2] https://issues.apache.org/jira/browse/OPENNLP-840
>> >>[3] https://docs.google.com/document/d/1nVnwpmGaOnwHERXr55IClE4V87jUX2sva-mkgWnR8n0/edit?usp=sharing
>> >>[4] https://docs.google.com/document/d/1x02II9W3rirtuSbx_sY8kOQZSgOp0SIKeIWTCXEOJvo/edit?usp=sharing
>> >>
>> >>> From: chris.a.mattm...@jpl.nasa.gov
>> >>> To: nishant....@gmail.com
>> >>> CC: dev@opennlp.apache.org; madhaw...@gmail.com; hmanj...@usc.edu;
>> >>> kamal...@usc.edu
>> >>> Subject: Re: GSOC2016 Sentiment Analysis
>> >>> Date: Sun, 27 Mar 2016 19:34:24 +0000
>> >>>
>> >>> No problem - I just wanted to encourage discussion. Thank you for
>> >>> your prompt and courteous replies.