Hi All,
My accepted proposal is "Integrate YAGO and AIDA NED with Apache Stanbol",
which addresses the issue STANBOL-1295 [1]. YAGO is a semantic knowledge
base similar to DBpedia and Freebase, but it provides much cleaner thematic
domains and has a useful representation of spatial, temporal and context
information about entities. As the initial part of this project, YAGO will
be integrated as a referenced site in Stanbol.

AIDA is a framework for entity disambiguation that uses YAGO as its
knowledge base. Although my proposal aimed at integrating AIDA, its
licence is incompatible, so integrating it is not possible. Therefore,
after an initial discussion with my assigned mentor Rafa, this project will
instead address a pending task related to disambiguation, Jira issue
STANBOL-1183 [2]: developing a disambiguation API for Stanbol.

I have started looking into the Stanbol indexing tool and its
configuration. I am also getting familiar with entity disambiguation in
Stanbol [3].

About myself: I am a final year undergraduate student in the Department of
Computer Science and Engineering, University of Moratuwa. I recently
completed my internship as an R&D Engineering intern at Zaizi, an open source
consultancy in enterprise content management, during which I had the
opportunity to work with open source projects such as Stanbol and Mahout.

Looking forward to a great summer of coding!

[1] https://issues.apache.org/jira/browse/STANBOL-1295
[2] https://issues.apache.org/jira/browse/STANBOL-1183
[3] https://issues.apache.org/jira/browse/STANBOL-1037

cheers,
Chalitha


On Tue, May 6, 2014 at 7:45 AM, Suman Saurabh
<ss.sumansaurab...@gmail.com> wrote:

> Hi all,
>
> Thank you all for allowing me to introduce myself and my proposal.
>
> My proposal is to build a "Speech to Text Enhancement Engine"; it was
> originally a Jira issue [1], STANBOL-1007. The TBD enhancement engine uses
> the Sphinx library to convert the captured audio. The media (audio/video)
> data file is parsed with the ContentItem and converted to the proper audio
> format by the Xuggler libraries. Speech is then extracted by Sphinx as
> 'text/plain', annotated with the temporal position of the extracted text.
> Sphinx uses an acoustic model and a language model to map utterances to
> text, so the engine will also support uploading an acoustic model and a
> language model.
>
> As mentioned in my proposal, my project mainly involves 3 parts:
> 1) Developing a module to extract sound from audio/video data in the
> following format: 16 kHz, 16 bit, mono, little-endian, using the Xuggler
> libraries.
> 2) A Sphinx module to convert the sound file to text with proper
> annotations, using the CMU Sphinx4 libraries.
> 3) Developing the Enhancement Engine that brings all these modules together.
>
> Currently I am working on the 1st module in discussion with my mentor
> Andreas Kuckartz.
>
> Now about myself: I am a pre-final year B. Tech student in Computer Science
> and Engineering at Laxmi Niwas Mittal Institute of Information Technology.
> I am into innovation. One of the projects I was involved in: my friend and
> I developed a very efficient method of head tracking that uses goggles to
> move the mouse pointer correspondingly. It helps people with hand
> disabilities interact with a computer efficiently [2].
> More details about me can be found in my LinkedIn profile [3].
>
> I have never been involved with an open source organization before, so
> working with all of you will be a very nice experience.
>
> Happy Bonding.
>
> [1] https://issues.apache.org/jira/browse/STANBOL-1007
> [2] http://gstanwar.blogspot.in/2014/01/avatar-goggle-by-which-disabledhand-and.html
> [3] http://in.linkedin.com/in/ssumansaurabh
>
> Regards,
> Suman Saurabh
>
>
>
> On Mon, May 5, 2014 at 1:54 PM, Antonio David Perez Morales <
> ape...@zaizi.com> wrote:
>
> > Hi Rupert, Florent and all
> >
> > My accepted project is "Enhancement Workflows. Enterprise Integration
> > Patterns in Apache Stanbol", based on the Jira issue [1]. Stanbol provides
> > a set of components for Semantic Content Management. One of the components
> > is the Enhancer, which can be used to extract features from content. The
> > Enhancer is organized using Enhancement Chains, which define how the
> > content will be processed, but they do not allow integrating the current
> > process with the business layer. The goal of the project is to bring EIP
> > (Enterprise Integration Patterns) to Stanbol to ease the integration of
> > the Enhancement workflows within the business layer of enterprise systems.
> > To achieve this, the Apache Camel framework is intended to be used as the
> > EIP provider.
> >
> > About myself: I hold a graduate degree in Computer Science Engineering
> > from the University of Seville and I am currently finishing a Master in
> > Software Engineering and Technology at that institution. I consider myself
> > a hardworking, problem-solving, quick-learning open source lover, mainly
> > interested in everything related to new technologies, whether web, mobile
> > or desktop. I love learning new things and facing new challenges every day.
> > I have coded for a long time with Java, PHP and JavaScript, and I use them
> > in my daily work. I can write clean and structured code following coding
> > rules and applying well-known design patterns to improve the quality and
> > maintainability of the code. For the last year, I have been working as a
> > Senior Software Engineer at the R&D division of Zaizi, an open source
> > consultancy specialized in Content and Enterprise Content Management
> > Systems. Apache Stanbol is one of the main components in our current
> > technical stack; therefore, I have been working with it extensively in the
> > last months, both integrating it with different enterprise systems such as
> > ECMs and directly contributing to the project. As a result of this effort,
> > I was confirmed as a committer of the project in January 2014.
> >
> >
> > Regarding the project, I have been taking a look at Florent's code for the
> > first approach to integrating Camel into Stanbol. Moreover, I have already
> > started to read more about and play with Camel (and Camel Spring) to
> > refresh my knowledge of it (I worked with Camel several years ago). As a
> > first example (which is one of the tasks I want to do in the integration),
> > I have been able to deploy some files with example Camel routes defined in
> > XML (camel-spring) in a local folder, and these routes are automatically
> > loaded by the example application I have deployed. This way, we can achieve
> > something similar to the indexing tool, where the indexing result files are
> > put in a directory inside Stanbol and the new Entityhub is automatically
> > generated from those files.
> >
> > I have also read the mail Florent pointed out in a previous mail about the
> > potential Camel protocols (components) which can be developed to map
> > Chains, Engines and Stores, but I would prefer to talk with Florent first
> > to decide the tasks to be done and their order, because I know the
> > proposal is very ambitious but achievable.
> >
> > So, as first steps (and while waiting to talk with Florent through the IRC
> > channel or otherwise) I will continue playing with Camel and I will review
> > Florent's current code again to get a clearer idea of how to improve it
> > so that it can be integrated as a first version of the Enhancement
> > Workflows.
> >
> > Comments are more than welcome.
> >
> > Regards
> >
> > -------------------------
> >
> > [1] https://issues.apache.org/jira/browse/STANBOL-1008
> >
> >
> > On Mon, May 5, 2014 at 10:01 AM, Rupert Westenthaler <
> > rupert.westentha...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Thx Florent for the reminder. I would like to ask all 4 students to
> > >
> > > 1. write a mail on this list with a short summary of the GSoC project
> > > (project summary + link to the Stanbol issue, some info about the
> > > student, first steps). IMO this is important as the proposals themselves
> > > are not fully publicly available.
> > > 2. join the #stanbol IRC channel on freenode.org (mentors are also
> > > welcome to join ^^). Having people around on IRC really helps to
> > > answer simple questions fast.
> > >
> > > and welcome to GSoC 2014!
> > >
> > > best
> > > Rupert
> > >
> > > On Thu, May 1, 2014 at 1:05 PM, florent andré
> > > <florent.andre-...@4sengines.com> wrote:
> > > > Hi there !
> > > >
> > > > As you may have noticed, the GSoC community bonding period began some
> > > > time ago.
> > > >
> > > > Speaking for the Camel/Stanbol integration [1], the good proposal from
> > > > Antonio was accepted ! Congrats !
> > > > So Antonio, now the bonding has to start! :)
> > > >
> > > > From my point of view, a good way to bond the community to this
> > > > integration could be to create sub-issues of STANBOL-1008, which can
> > > > be considered the main one. That way we can see the more specific
> > > > actions you will take, discuss specific parts in the related issue,
> > > > and get a global overview when looking at the parent issue.
> > > >
> > > > Antonio what do you think ? Can you do that ?
> > > >
> > > > As a side point, I remembered this morning this mail exchange [2],
> > > > which can give you pointers or ideas for "easy to set up through REST"
> > > > Camel routes / flowcharts.
> > > >
> > > > Happy bonding !
> > > > ++
> > > >
> > > >
> > > > [1] be warned, don't know if any-one can access it :
> > > > https://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/adperezmorales3/5629499534213120
> > > >
> > > > [2] http://mail-archives.apache.org/mod_mbox/incubator-stanbol-dev/201206.mbox/%3c4fdfc494.3090...@4sengines.com%3E
> > >
> > >
> > >
> > > --
> > > | Rupert Westenthaler             rupert.westentha...@gmail.com
> > > | Bodenlehenstraße 11                              ++43-699-11108907
> > > | A-5500 Bischofshofen
> > > | REDLINK.CO..........................................................................
> > > | http://redlink.co/
> > >
> >
>



-- 
J.M Chalitha Udara Perera

*Department of Computer Science and Engineering,*
*University of Moratuwa,*
*Sri Lanka*
