This is very helpful. Thank you! Is there any use in having the tika-dl
module if our more modern approach is REST + Docker? The upkeep in tika-dl
is nontrivial.

On Fri, Jul 6, 2018 at 6:15 PM Chris Mattmann <[email protected]> wrote:

> Tim,
>
> Thanks. There are multiple modes of integrating deep learning with Tika:
>
> 1. The original mode: uses Thamme’s work on exposing TensorFlow over
>    REST, with Docker, to provide a REST service to Tika for running
>    TensorFlow DL models. We initially did Inception v3, and a model by
>    Madhav Sharan that combines OpenCV with Inception v3 (plus a new
>    Docker image that installs OpenCV, which is a pain), for image and
>    video object recognition, respectively. See:
>    https://github.com/apache/tika/pull/208
>    and https://github.com/apache/tika/pull/168 and also the wiki.
>
> 2. Later, Thamme, Avtar Singh, and KranthiGV added DL4J support:
>    https://github.com/apache/tika/pull/165
>    including Inception v3 and VGG16:
>    https://github.com/apache/tika/pull/182
>    This houses the model in the USC Data Science repo and uses it as an
>    example of how to store and load models from Keras/Python into DL4J:
>    https://github.com/USCDataScience/dl4j-kerasimport-examples/tree/master/dl4j-import-example/data
>
> 3. Then, Thejan added text captioning, with a new Docker image and a
>    trained model:
>    https://github.com/apache/tika/pull/180
>
> 4. Then Raunaq from UPenn added Inception v4 support via the
>    Docker/TensorFlow way:
>    https://github.com/apache/tika/pull/162
>
> 5. All this Docker work caused Thejan and others to think we needed to
>    refactor the Dockers. We did that here:
>    https://github.com/apache/tika/pull/208
>    to make them cleaner, and to depend on
>    http://github.com/USCDataScience/tika-dockers/ and on the
>    http://github.com/USCDataScience/img2text
>    models for image captioning. Now video and image recognition and
>    image captioning all share the same base Docker image, with
>    sub-Dockers derived from it.
>
> That’s where we’re at today. Make sense? ☺ Thejan and others want to add
> more DL4J-supported models, and we can always use TensorFlow/Docker as
> another way of doing it.
>
>
>
> Cheers,
>
> Chris
>
> From: Tim Allison <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, July 6, 2018 at 2:39 PM
> To: "[email protected]" <[email protected]>
> Subject: image recognition...how do the parts play together?
>
>
>
> On Twitter[1], Chris, Thamme, Thejan, and I are working with some
> deeplearning4j devs to help us upgrade to deeplearning4j 1.0.0-BETA
> (TIKA-2672).
>
> I initially requested help from Thejan (and Thamme :D) for this because we
> were getting an initialization exception after the upgrade in tika-dl's
> DL4JInceptionV3Net.
>
> According to our wiki[2], we upgraded to InceptionV4 in TIKA-2306 by adding
> the TensorFlowRESTRecogniser...does this mean we can get rid of
> DL4JInceptionV3Net?  Or, what are we actually asking the dl4j folks to help
> with?
>
> How do these recognizers play together?
>
> Thank you.
>
> Cheers,
>
>          Tim
>
> [1] e.g.  https://twitter.com/chrismattmann/status/1015340483923439617
> [2] https://wiki.apache.org/tika/TikaAndVision
>
