Welcome, Thejan! It's great to have you on board!

Tyler
(Catching up on some old email.)

On Thu, May 10, 2018, 4:52 PM Thejan Wijesinghe <[email protected]> wrote:

> Hi Chris Mattmann,
>
> Thank you for the invitation.
>
> Hi everyone,
>
> First of all, I should say, I am very excited to be on board. Being a PMC
> member in Tika is a huge accomplishment because Tika is one of those TLPs
> in Apache with a history of more than 10 years.
>
> I’m currently a final year undergraduate at Univ. of Moratuwa, Sri Lanka.
> I found a keen interest in information retrieval, data science and machine
> learning related domains. Tika, being one of the key technologies, used in
> many information retrieval applications, I got the opportunity to work
> with Tika, couple of years back but never got the chance to use Tika for
> an industry level application until my internship. During my internship, I
> worked with a startup in SL, to build their own cognitive platform where I
> had to use some of the Apache technologies such as Kafka, Solr, Superset
> (incubating) and Tika. We could successfully complete the initial version
> of the platform and I still work as an external consultant for the same
> project.
>
> However, becoming a committer to Apache Tika was one of the life goals I
> set when I got selected as the Google Summer of Code intern at Apache Tika
> in 2017. My project was “Supporting Image-to-Text (Image Captioning) in
> Tika for Image MIME Types”[1], it was an amazing project idea by Thamme
> Gowda, which lots of people paid so much attention. I was mentored by
> Chris Mattmann and Thamme Gowda. I feel myself very lucky to have met
> these two people in my life, because not for them, I don’t think, I would
> ever find the guidance to become a PMC member or a committer.
>
> Most of my contributions are related to enhancing ML based capabilities in
> Tika. I have many future plans to improve the Tika-dl module. Including a
> parser with NMT based translation, a sentiment parser, a dl4j based
> captioning parser to tika-dl. I also love to improve Tika’s capabilities
> in mime type detection and language detection. Other than that, I would
> love to clean up some of the parsers in Tika. Our code base is quite a big
> one, evolved throughout many years and I have seen instances where some of
> the parsers, not being in their appropriate place, just to point out as an
> example, we have an age recognizer parser in the Tika-nlp module while
> having a sentiment parser under Tika-parsers module.
>
> I know that’s quite a lot of plans, I got there for Tika, but I have
> nothing to be afraid of because I got an entire lifetime to accomplish
> them.
>
> [1] https://issues.apache.org/jira/browse/TIKA-2262
>
> Thanks and Best Regards,
> ThejanW
>
> On Tue, May 8, 2018 at 12:10 AM Chris Mattmann <[email protected]> wrote:
>
> > Welcome to Thejan Wijesinghe who has joined as a new Tika PMC member and
> > committer!
> >
> > Please say a bit about yourself…thanks!
> >
> > Cheers,
> > Chris
