[ https://issues.apache.org/jira/browse/TIKA-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537293#comment-16537293 ]
ASF GitHub Bot commented on TIKA-2672:
--------------------------------------
chrismattmann commented on issue #241: Fix for TIKA-2672
URL: https://github.com/apache/tika/pull/241#issuecomment-403554981
Inceptionv3 works great!
## Inception server
```
nonas:tika2.0.0 mattmann$ tika --config=tika-dl/src/test/resources/org/apache/tika/dl/imagerec/dl4j-inception3-config.xml
Jul 09, 2018 10:18:53 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed.
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.
Jul 09, 2018 10:18:53 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: Tesseract OCR is installed and will be automatically applied to image files unless
you've excluded the TesseractOCRParser from the default parser.
Tesseract may dramatically slow down content extraction (TIKA-2359).
As of Tika 1.15 (and prior versions), Tesseract is automatically called.
In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
Jul 09, 2018 10:18:53 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded.
Please provide the jar on your classpath to parse sqlite files.
See tika-parsers/pom.xml for the correct version.
INFO Starting Apache Tika 2.0.0-SNAPSHOT server
INFO Using custom config: tika-dl/src/test/resources/org/apache/tika/dl/imagerec/dl4j-inception3-config.xml
INFO Cache doesn't exist. Going to make a copy
INFO This might take a while! GET https://github.com/USCDataScience/tika-dockers/releases/download/v0.2/inception_v3_keras_2.h5
INFO Cache doesn't exist. Going to make a copy
INFO This might take a while! GET https://github.com/USCDataScience/tika-dockers/releases/download/v0.2/imagenet_class_index.json
INFO Going to load Inception network...
INFO Unexpected end-of-input: expected close marker for OBJECT (from
[Source: {"config": {"output_layers": [["predictions", 0, 0]], "layers":
[{"class_name": "InputLayer", "name": "input_1", "config":
{"batch_input_shape": [null, null, null, 3], "dtype": "float32", "sparse":
false, "name": "input_1"}, "inbound_nodes": []}, {"class_name": "Conv2D",
"name": "conv2d_1", "config": {"activity_regularizer": null, "strides": [2, 2],
"padding": "valid", "kernel_regularizer": null, "kernel_initializer":
{"class_name": "VarianceScaling", "config": {"scale": 1.0, "distribution":
"uniform", "mode": "fan_avg", "seed": null}}, "data_format": "channels_last",
"activation": "linear", "bias_regularizer": null, "kernel_size": [3, 3],
"dilation_rate": [1, 1], "use_bias": false, "trainable": true,
"kernel_constraint": null, "bias_constraint": null, "bias_initializer":
{"class_name": "Zeros", "config": {}}, "filters": 32, "name": "conv2d_1"},
"inbound_nodes": [[["input_1", 0, 0, {}]]]}, {"class_name":
"BatchNormalization", "name": "batch_normalization_1", "config": {"center":
true, "gamma_initializer": {"class_name": "Ones", "config": {}},
"beta_constraint": null, "gamma_constraint": null,
"moving_variance_initializer": {"class_name": "Ones", "config": {}},
"moving_mean_initializer": {"class_name": "Zeros", "config": {}}, "scale":
false, "momentum": 0.99, "gamma_regularizer": null, "trainable": true,
"epsilon": 0.001, "axis": 3, "beta_initializer": {"class_name": "Zeros",
"config": {}}, "beta_regularizer": null, "name": "batch_normalization_1"},
"inbound_nodes": [[["conv2d_1", 0, 0, {}]]]}, {"class_name": "Activation",
"name": "activation_1", "config": {"activation": "relu", "trainable":
....suppressed
{"activity_regularizer": null, "strides": [1, 1], "padding": "same",
"kernel_regularizer": null, "kernel_i; line: 1, column: 64001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@7c711375; line: 1, column: 36001]
INFO Unexpected end-of-input within/between OBJECT entries
at [Source: java.io.StringReader@57cf54e1; line: 1, column: 40001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@5b03b9fe; line: 1, column: 44001]
INFO Unrecognized token 'tru': was expecting 'null', 'true', 'false' or NaN
at [Source: java.io.StringReader@37d4349f; line: 1, column: 56001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@434a63ab; line: 1, column: 52001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@6e0f5f7f; line: 1, column: 56001]
INFO Unexpected end-of-input: expected close marker for ARRAY (from
[Source: java.io.StringReader@2805d709; line: 1, column: 45999])
at [Source: java.io.StringReader@2805d709; line: 1, column: 60001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@3ee37e5a; line: 1, column: 64001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@2ea41516; line: 1, column: 68001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@3a44431a; line: 1, column: 72001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@3c7f66c4; line: 1, column: 76001]
INFO Unexpected end-of-input: was expecting closing quote for a string value
at [Source: java.io.StringReader@194bcebf; line: 1, column: 80001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@17497425; line: 1, column: 84001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@f0da945; line: 1, column: 88001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@4803b726; line: 1, column: 92001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@ffaa6af; line: 1, column: 96001]
INFO Unexpected end-of-input: was expecting closing quote for a string value
at [Source: java.io.StringReader@53ce1329; line: 1, column: 68001]
INFO Unexpected end-of-input: expected close marker for ARRAY (from
[Source: java.io.StringReader@316bcf94; line: 1, column: 67972])
at [Source: java.io.StringReader@316bcf94; line: 1, column: 80001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@6404f418; line: 1, column: 76001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@3e11f9e9; line: 1, column: 80001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@1de5f259; line: 1, column: 84001]
INFO Unexpected end-of-input within/between OBJECT entries
at [Source: java.io.StringReader@729d991e; line: 1, column: 88001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@31fa1761; line: 1, column: 92001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@957e06; line: 1, column: 96001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@32502377; line: 1, column: 100001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@2c1b194a; line: 1, column: 104001]
INFO Unexpected end-of-input within/between ARRAY entries
at [Source: java.io.StringReader@4dbb42b7; line: 1, column: 108001]
INFO Unexpected end-of-input within/between OBJECT entries
at [Source: java.io.StringReader@66f57048; line: 1, column: 112001]
INFO Unexpected end-of-input: was expecting closing quote for a string value
at [Source: java.io.StringReader@550dbc7a; line: 1, column: 116001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@21282ed8; line: 1, column: 120001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@36916eb0; line: 1, column: 124001]
INFO Unrecognized token 'fals': was expecting 'null', 'true', 'false' or NaN
at [Source: java.io.StringReader@7bab3f1a; line: 1, column: 160001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@437da279; line: 1, column: 100001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@23c30a20; line: 1, column: 104001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@1e1a0406; line: 1, column: 108001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@3cebbb30; line: 1, column: 112001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@12aba8be; line: 1, column: 116001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@290222c1; line: 1, column: 120001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@67f639d3; line: 1, column: 124001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@6253c26; line: 1, column: 128001]
INFO Unexpected end-of-input: was expecting closing quote for a string value
at [Source: java.io.StringReader@49049a04; line: 1, column: 132001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@71a8adcf; line: 1, column: 136001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@27462a88; line: 1, column: 140001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@82de64a; line: 1, column: 144001]
INFO Unexpected end-of-input within/between ARRAY entries
at [Source: java.io.StringReader@659499f1; line: 1, column: 148001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@51e69659; line: 1, column: 152001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@47e2e487; line: 1, column: 156001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@201a4587; line: 1, column: 160001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@61001b64; line: 1, column: 132001]
INFO Unexpected end-of-input: was expecting closing quote for a string value
at [Source: java.io.StringReader@4310d43; line: 1, column: 136001]
INFO Unexpected end-of-input in FIELD_NAME
at [Source: java.io.StringReader@54a7079e; line: 1, column: 140001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@26e356f0; line: 1, column: 144001]
INFO Unexpected end-of-input: was expecting closing quote for a string value
at [Source: java.io.StringReader@47d9a273; line: 1, column: 148001]
INFO Unexpected end-of-input: was expecting closing quote for a string value
at [Source: java.io.StringReader@4b8ee4de; line: 1, column: 152001]
INFO Unexpected end-of-input: was expecting closing '"' for name
at [Source: java.io.StringReader@27f981c6; line: 1, column: 156001]
INFO Unexpected end-of-input: expected close marker for OBJECT (from
[Source: java.io.StringReader@1b11171f; line: 1, column: 143942])
at [Source: java.io.StringReader@1b11171f; line: 1, column: 160001]
INFO Unrecognized token 'tru': was expecting 'null', 'true', 'false' or NaN
at [Source: java.io.StringReader@1151e434; line: 1, column: 182001]
INFO Loaded [CpuBackend] backend
INFO Number of threads used for NativeOps: 2
INFO Number of threads used for BLAS: 2
INFO Backend used: [CPU]; OS: [Mac OS X]
INFO Cores: [4]; Memory: [3.6GB];
INFO Blas vendor: [MKL]
INFO Starting ComputationGraph with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]
INFO Loaded the Inception model. Time taken=2657ms
INFO Recogniser = org.apache.tika.dl.imagerec.DL4JInceptionV3Net
INFO Recogniser Available = true
INFO Setting the server's publish address to be http://localhost:9998/
INFO jetty-8.y.z-SNAPSHOT
INFO Started SelectChannelConnector@localhost:9998
INFO Started Apache Tika server at http://localhost:9998/
INFO rmeta (autodetecting type)
INFO Time taken 1014ms
INFO Add RecognisedObject{label='lion' (en), id='291', confidence=0.9375439286231995}
```
## Inception client
```
nonas:imagerec mattmann$ curl -T lion.jpg http://localhost:9998/rmeta | python -mjson.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 45617    0  1176  100 44441    920  34790  0:00:01  0:00:01 --:--:-- 34801
[
{
"Content-Type": "image/jpeg",
"OBJECT": "lion (0.93754)",
"X-Parsed-By": [
"org.apache.tika.parser.CompositeParser",
"org.apache.tika.parser.recognition.ObjectRecognitionParser"
],
"X-TIKA:content": "<html xmlns=\"http://www.w3.org/1999/xhtml\">\n<head>\n<meta name=\"org.apache.tika.parser.recognition.object.rec.impl\" content=\"org.apache.tika.dl.imagerec.DL4JInceptionV3Net\" />\n<meta name=\"X-Parsed-By\" content=\"org.apache.tika.parser.CompositeParser\" />\n<meta name=\"X-Parsed-By\" content=\"org.apache.tika.parser.recognition.ObjectRecognitionParser\" />\n<meta name=\"OBJECT\" content=\"lion (0.93754)\" />\n<meta name=\"Content-Type\" content=\"image/jpeg\" />\n<title></title>\n</head>\n<body><ol id=\"objects\">\t<li id=\"291\"> lion [en](confidence = 0.937544)</li>\n</ol>\n</body></html>",
"X-TIKA:parse_time_millis": "1086",
"org.apache.tika.parser.recognition.object.rec.impl":
"org.apache.tika.dl.imagerec.DL4JInceptionV3Net"
}
]
```
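For anyone scripting against this endpoint, the label and confidence can be pulled out of the `OBJECT` field of the response shown above. This is only a minimal sketch: `recognised_objects` and `OBJECT_PATTERN` are hypothetical names, not part of Tika, and the `label (confidence)` layout is inferred from the single response in this comment. It also assumes that several recognised objects would arrive as a list of such strings, the way `X-Parsed-By` does.

```python
import json
import re

# Hypothetical helper, not part of Tika. The "label (confidence)" layout
# is inferred from the one rmeta response shown in this comment.
OBJECT_PATTERN = re.compile(r"^(?P<label>.+?) \((?P<confidence>[0-9.]+)\)$")


def recognised_objects(rmeta_json):
    """Return (label, confidence) pairs found in an rmeta JSON response."""
    results = []
    for doc in json.loads(rmeta_json):
        value = doc.get("OBJECT", [])
        # Assumption: a key holds either one string or a list of strings.
        values = [value] if isinstance(value, str) else value
        for item in values:
            match = OBJECT_PATTERN.match(item)
            if match:
                results.append(
                    (match.group("label"), float(match.group("confidence")))
                )
    return results


# Trimmed copy of the response above, keeping only two of its fields.
sample = '[{"Content-Type": "image/jpeg", "OBJECT": "lion (0.93754)"}]'
print(recognised_objects(sample))  # [('lion', 0.93754)]
```

Entries that do not match the assumed layout are skipped rather than raising, so a change in the output format shows up as an empty result.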
> Upgrade dl4j to 1.0.0-beta
> --------------------------
>
> Key: TIKA-2672
> URL: https://issues.apache.org/jira/browse/TIKA-2672
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
> Attachments: TIKA-2672.patch
>
>
> Let's try to upgrade dl4j. I think I got us most of the way there, but I got
> this error when reading the JSON config file. Can someone with more
> knowledge of layer specs help ([~thammegowda], perhaps :))?
> {noformat}
> org.deeplearning4j.exception.DL4JInvalidConfigException: Invalid configuration for layer (idx=-1, name=convolution2d_2, type=ConvolutionLayer) for width dimension: Invalid input configuration for kernel width. Require 0 < kW <= inWidth + 2*padW; got (kW=3, inWidth=1, padW=0)
> Input type = InputTypeConvolutional(h=149,w=1,c=32), kernel = [3, 3], strides = [1, 1], padding = [0, 0], layer size (output channels) = 32, convolution mode = Truncate
> {noformat}
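The condition quoted in the exception can be checked by hand against the reported values. The sketch below only restates that inequality; `kernel_fits` is a hypothetical name, not a DL4J function. It shows that the 3x3 kernel fails on the width (`w=1`) but would pass on the height (`h=149`) taken from the same message.

```python
def kernel_fits(kernel, in_size, padding):
    """Return True when 0 < kernel <= in_size + 2 * padding."""
    return 0 < kernel <= in_size + 2 * padding


# Width values reported in the exception: kW=3, inWidth=1, padW=0.
print(kernel_fits(3, 1, 0))    # False, since 3 > 1 + 2*0

# Height from the same message (h=149) with the same kernel and padding.
print(kernel_fits(3, 149, 0))  # True
```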