Nick,

It was great to read through http://events.linuxfoundation.org/sites/events/files/slides/WhatsNewWithApacheTika_1.pdf. Wow, there is a lot in Tika.
And I think that might be the one challenge with the talk structure: there is SO much information. I’d like to see a section on “How is Tika actually architected?” that explains how it supports so many amazing use cases. If this talk is meant for folks who don’t already know a lot about the project, then they might get overwhelmed by the long lists, such as all the file types it can handle. Maybe change some of them to “here is an eye chart of logos, don’t actually read it” and consolidate some pages.

Eric

> On May 16, 2017, at 10:38 AM, Thamme Gowda <[email protected]> wrote:
>
> Nick,
> Here are some pointers:
> 1. Image recognition using Tensorflow:
>    https://wiki.apache.org/tika/TikaAndVision; Link to Paper:
>    https://memex.jpl.nasa.gov/MFSEC17.pdf
> 2. Image Recognition using Deeplearning4j -
>    https://wiki.apache.org/tika/TikaAndVisionDL4J
> 3. Sentiment Analysis using OpenNLP: https://github.com/apache/tika/pull/169
> 4. Video labeling using tensorflow image rec:
>    https://wiki.apache.org/tika/TikaAndVisionVideo
> 5. Named Entity Extraction using OpenNLP and CoreNLP:
>    https://wiki.apache.org/tika/TikaAndNER
>
> *Coming soon (Work in progress):*
> 6. Image Captioning (Image-to-Text) https://github.com/apache/tika/pull/180
>
> Cheers,
> -Thamme
>
> *--*
> *Thamme Gowda*
> TG | @thammegowda <https://twitter.com/thammegowda>
> ~Sent via somebody's Webmail server!
>
> On Tue, May 16, 2017 at 6:59 AM, Chris Mattmann <[email protected]> wrote:
>
>> Yep, literally take a look at the Tika wiki – there are examples aplenty
>> and even screenshots. Further, if you look at the MEMEX site under our new
>> publications section, there are a few examples (like the ICMR paper on
>> forensics) that show it in action.
>>
>> http://memex.jpl.nasa.gov/#publications
>>
>>
>> On 5/16/17, 6:21 AM, "Konstantin Gribov" <[email protected]> wrote:
>>
>>     IIRC, image and video labeling basic support was added (Chris & Thamme
>>     could you elaborate on that, please), TSD (TIKA-2309, time stamped data
>>     envelope format) support, slf4j migration (ongoing on 2.x branch).
>>
>>     On Tue, 16 May 2017 at 16:06, Allison, Timothy B. <[email protected]> wrote:
>>
>>> Doh! Sorry for the delay...might add configuration of EncodingDetectors,
>>> but that's probably too far into the weeds?
>>>
>>> -----Original Message-----
>>> From: Nick Burch [mailto:[email protected]]
>>> Sent: Sunday, May 14, 2017 11:34 AM
>>> To: [email protected]
>>> Subject: Tika talk next week - help needed!
>>>
>>> Hi All
>>>
>>> Last year in Seville, I gave a talk on Tika entitled "Apache Tika - What’s
>>> new with 2.0?". For ApacheCon Miami next week, I've been roped into giving
>>> an updated version...
>>>
>>> https://apachecon2017.sched.com/event/9zvD/apache-tika-whats-new-with-20-nick-burch-apache-software-foundation
>>>
>>> My slides from Seville are available at:
>>>
>>> http://events.linuxfoundation.org/sites/events/files/slides/WhatsNewWithApacheTika_1.pdf
>>>
>>> Beyond updating the list of releases and parsers, and the slide
>>> background, what should I change?
>>>
>>> Maybe some more on Tika eval? More details on some of the NLP / Entity
>>> Recognition / Image Recognition stuff? Some screenshots of that stuff?
>>> More on translation? Something else?
>>>
>>> Ideas greatly appreciated!
>>> Good screenshots even more so :)
>>>
>>> Cheers
>>> Nick
>>>
>>     --
>>
>>     Best regards,
>>     Konstantin Gribov
>>

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
