Hi all,

Thank you all for allowing me to introduce myself and my proposal.

My proposal is to build a "Speech to Text Enhancement Engine". It was originally a Jira issue [1], STANBOL-1007. The planned enhancement engine uses the Sphinx library to convert captured audio to text. The media (audio/video) file is parsed from the ContentItem and converted to the proper audio format by the Xuggler libraries. Speech is then extracted by Sphinx as 'text/plain', annotated with the temporal position of the extracted text. Sphinx uses an acoustic model and a language model to map utterances to text, so the engine will also support uploading acoustic and language models.

As mentioned in my proposal, my project mainly involves 3 parts:

1) Developing a module to extract sound from audio/video data in the following format: 16 kHz, 16 bit, mono, little-endian, using the Xuggler libraries.
2) A Sphinx module to convert the sound file to text with proper annotations, using the CMU Sphinx4 libraries.
3) Developing the Enhancement Engine that integrates these modules.

Currently I am working on the 1st module, in discussion with my mentor Andreas Kuckartz.

Now about myself: I am a pre-final year B.Tech student in Computer Science and Engineering at Laxmi Niwas Mittal Institute of Information Technology. I am into innovation. One of the projects I was involved in: my friend and I developed a very efficient method of head tracking that uses goggles to drive the corresponding mouse pointer movement. It helps people with hand disabilities interact with a computer efficiently [2]. More details about me can be found in my LinkedIn profile [3].

I have never been involved with an open source organization before; working with all of you will be a very nice experience.

Happy Bonding.

[1] https://issues.apache.org/jira/browse/STANBOL-1007
[2] http://gstanwar.blogspot.in/2014/01/avatar-goggle-by-which-disabledhand-and.html
[3] http://in.linkedin.com/in/ssumansaurabh

Regards,
Suman Saurabh

On Mon, May 5, 2014 at 1:54 PM, Antonio David Perez Morales <ape...@zaizi.com> wrote:

> Hi Rupert, Florent and all
>
> My accepted project is "Enhancement Workflows. Enterprise Integration
> Patterns in Apache Stanbol", based on the Jira issue [1]. Stanbol provides
> a set of components for Semantic Content Management. One of the components
> is the Enhancer, which can be used to extract features from content. The
> Enhancer is organized using Enhancements Chains, which defines how the
> content will be processed but they don't allow to integrate the current
> process with the business layer. The goal of the project is to bring EIP to
> Stanbol for easing the integration of the Enhancement workflows within the
> business layer of enterprise systems. In order to achieve this, Apache
> Camel framework is intended to be used as EIP pprovider.
>
> About my person, I hold a graduate degree in Computer Science Engineering
> from the University of Seville and I am currently finishing a Master in
> Software Engineering and Technology at that institution. I consider myself
> hardworking, problem-solving, quick-learning an open source lover mainly
> interested in all related with new technologies either web, mobile or
> desktop. I love learning new things and facing new challenges every day. I
> have coded for a long time with Java, PHP and Javascript. I use them on my
> daily work. I can write clean and structured code following code rules and
> applying well-known design patterns to improve the quality and maintenance
> of the code. Last year, I have been working as Senior Software Engineer at
> the R&D division of Zaizi, an open source consultant specialized in Content
> and Enterprise Content Management Systems.
> Apache Stanbol is one of the
> main components in our current technical stack; therefore, I have been
> widely working with it in the last months, both making integrations with
> different enterprise systems like ECMs and directly contributing to the
> project. As a result of this effort, I have been confirmed as committer of
> the project since January 2014.
>
>
> Regarding the project, I have been taking a look at Florent code about the
> first approach to integrate Camel into Stanbol. Moreover I have already
> started to read more and play with Camel (and Camel Spring) to refresh and
> familiarize with it (because I worked with Camel several years ago). As a
> first example (which is one of the tasks I want to do in the integration) I
> have been able to deploy in a local folder some files with example Camel
> routes defined in XML (camel-spring) and these routes are automatically
> loaded by the example application I have deployed. This way, we can achieve
> something similar to the indexing tool, where the indexing result files are
> put in a directory inside Stanbol and automatically the new Entityhub is
> generated from those files.
>
> I have also read the mail Florent pointed out in a previous mail about the
> potential Camel protocols (components) which can be developed to map
> Chains, Engines and Stores but I would prefer to talk with Florent first to
> decide the tasks to be done and the order of them, because I know the
> proposal is very ambitious but achievable.
>
> So, as first steps (and while waiting to talk with Florent through IRC
> channel or whatever) I will continue playing with Camel and I will review
> again the current Florent code to have a clearer idea on how to improve
> this code in order to be integrated as a first version of the Enhancement
> Workflows.
>
> Please, comments are more than welcome.
>
> Regards
>
> -------------------------
>
> [1] https://issues.apache.org/jira/browse/STANBOL-1008
>
>
> On Mon, May 5, 2014 at 10:01 AM, Rupert Westenthaler <rupert.westentha...@gmail.com> wrote:
>
> > Hi all,
> >
> > Thx florent for the reminder. I would like to ask all 4 Students to
> >
> > 1. write a mail on this list with a short summary of the GSoC project
> > (project summary + link to the stanbol issue, some info about the
> > student, first steps). IMO this is important as the Proposals itself
> > are not fully public available.
> > 2. to join the #stanbol IRC list on freenode.org (also mentors are
> > welcome to join ^^). Having the people around on IRC really helps to
> > answer simple questions fast.
> >
> > and welcome to GSoC 2014!
> >
> > best
> > Rupert
> >
> > On Thu, May 1, 2014 at 1:05 PM, florent andré
> > <florent.andre-...@4sengines.com> wrote:
> > > Hi there !
> > >
> > > As you may notice Gsoc community bonding period has begin for some time now.
> > >
> > > Speaking for Camel/Stanbol integration [1], the good proposal from Antonio
> > > was accepted ! Congrats !
> > > So Antonio, now bonding have to start! :)
> > >
> > > From my point of view, a good way to bond the community to this integration
> > > could be to create sub-issues to the "can be considered as the main one"
> > > STANBOL-1008. So we can see more specific actions you will take and discuss
> > > specific parts in the related issue, and get a global overview when looking
> > > at the parent issue.
> > >
> > > Antonio what do you think ? Can you do that ?
> > >
> > > As a side point, I remembered this morning this mail [2] exchange that can
> > > give you pointer or idea for an "easy to set up throw REST" Camel's routes /
> > > flowchart.
> > >
> > > Happy bonding !
> > > ++
> > >
> > >
> > > [1] be warned, don't know if any-one can access it :
> > >
> > > https://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/adperezmorales3/5629499534213120
> > >
> > > [2]
> > >
> > > http://mail-archives.apache.org/mod_mbox/incubator-stanbol-dev/201206.mbox/%3c4fdfc494.3090...@4sengines.com%3E
> > >
> >
> > --
> > | Rupert Westenthaler             rupert.westentha...@gmail.com
> > | Bodenlehenstraße 11             ++43-699-11108907
> > | A-5500 Bischofshofen
> > | REDLINK.CO..........................................................................
> > | http://redlink.co/
> >
>
> --
>
> ------------------------------
> This message should be regarded as confidential. If you have received this
> email in error please notify the sender and destroy it immediately.
> Statements of intent shall only become binding when confirmed in hard copy
> by an authorised signatory.
>
> Zaizi Ltd is registered in England and Wales with the registration number
> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
> London W6 7AN.
>