[ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075602#comment-16075602 ]
ASF GitHub Bot commented on TIKA-2298:
--------------------------------------

chrismattmann commented on issue #182: Creation of TIKA-2298 contributed by asmehra95- Import of vgg16 via Deeplearning4j into tika-dl
URL: https://github.com/apache/tika/pull/182#issuecomment-313252275

So @asmehra95 @thammegowda I have been testing this out. I can't get the unit tests to pass. See below:

```bash
LMC-053601:tika-dl mattmann$ history | grep export
  546  export MAVEN_OPTS="-Xms2048m"
  548  export MAVEN_OPTS="-Xmx3G"
  550  history | grep export
LMC-053601:tika-dl mattmann$
```

```bash
[INFO]
[INFO] --- maven-resources-plugin:2.7:resources (default-resources) @ tika-dl ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 2 resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-compiler-plugin:3.2:compile (default-compile) @ tika-dl ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 2 source files to /Users/mattmann/tmp/tika1.15/tika-dl/target/classes
[INFO]
[INFO] --- maven-resources-plugin:2.7:testResources (default-testResources) @ tika-dl ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 4 resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-compiler-plugin:3.2:testCompile (default-testCompile) @ tika-dl ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 2 source files to /Users/mattmann/tmp/tika1.15/tika-dl/target/test-classes
[INFO]
[INFO] --- maven-surefire-plugin:2.18.1:test (default-test) @ tika-dl ---
[INFO] Surefire report directory: /Users/mattmann/tmp/tika1.15/tika-dl/target/surefire-reports

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.691 sec - in org.apache.tika.dl.imagerec.DL4JInceptionV3NetTest
Running org.apache.tika.dl.imagerec.DL4JVGG16NetTest
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 130.047 sec <<< FAILURE! - in org.apache.tika.dl.imagerec.DL4JVGG16NetTest
recognise(org.apache.tika.dl.imagerec.DL4JVGG16NetTest)  Time elapsed: 130.047 sec  <<< ERROR!
java.lang.OutOfMemoryError: Cannot allocate new FloatPointer(102760448): totalBytes = 1G, physicalBytes = 2G
    at org.bytedeco.javacpp.Pointer.deallocator(Pointer.java:568)
    at org.bytedeco.javacpp.Pointer.init(Pointer.java:121)
    at org.bytedeco.javacpp.FloatPointer.allocateArray(Native Method)
    at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:68)
    at org.nd4j.linalg.api.buffer.BaseDataBuffer.<init>(BaseDataBuffer.java:445)
    at org.nd4j.linalg.api.buffer.FloatBuffer.<init>(FloatBuffer.java:57)
    at org.nd4j.linalg.api.buffer.factory.DefaultDataBufferFactory.createFloat(DefaultDataBufferFactory.java:236)
    at org.nd4j.linalg.factory.Nd4j.createBuffer(Nd4j.java:1301)
    at org.nd4j.linalg.factory.Nd4j.createBuffer(Nd4j.java:1275)
    at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:252)
    at org.nd4j.linalg.cpu.nativecpu.NDArray.<init>(NDArray.java:109)
    at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.create(CpuNDArrayFactory.java:247)
    at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:4768)
    at org.nd4j.linalg.cpu.nativecpu.CpuNDArrayFactory.toFlattened(CpuNDArrayFactory.java:502)
    at org.nd4j.linalg.factory.BaseNDArrayFactory.toFlattened(BaseNDArrayFactory.java:321)
    at org.nd4j.linalg.factory.Nd4j.toFlattened(Nd4j.java:1846)
    at org.deeplearning4j.nn.weights.WeightInitUtil.initWeights(WeightInitUtil.java:111)
    at org.deeplearning4j.nn.weights.WeightInitUtil.initWeights(WeightInitUtil.java:61)
    at org.deeplearning4j.nn.params.DefaultParamInitializer.createWeightMatrix(DefaultParamInitializer.java:145)
    at org.deeplearning4j.nn.params.DefaultParamInitializer.createWeightMatrix(DefaultParamInitializer.java:133)
    at org.deeplearning4j.nn.params.DefaultParamInitializer.init(DefaultParamInitializer.java:82)
    at org.deeplearning4j.nn.conf.layers.DenseLayer.instantiate(DenseLayer.java:56)
    at org.deeplearning4j.nn.conf.graph.LayerVertex.instantiate(LayerVertex.java:92)
    at org.deeplearning4j.nn.graph.ComputationGraph.init(ComputationGraph.java:370)
    at org.deeplearning4j.nn.graph.ComputationGraph.init(ComputationGraph.java:274)
    at org.deeplearning4j.nn.modelimport.keras.KerasModel.getComputationGraph(KerasModel.java:483)
    at org.deeplearning4j.nn.modelimport.keras.KerasModel.getComputationGraph(KerasModel.java:471)
    at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasModelAndWeights(KerasModelImport.java:178)
    at org.deeplearning4j.nn.modelimport.keras.trainedmodels.TrainedModelHelper.loadModel(TrainedModelHelper.java:70)
    at org.apache.tika.dl.imagerec.DL4JVGG16Net.initialize(DL4JVGG16Net.java:102)
    at org.apache.tika.parser.recognition.ObjectRecognitionParser.initialize(ObjectRecognitionParser.java:101)
    at org.apache.tika.config.TikaConfig$XmlLoader.loadOne(TikaConfig.java:638)
    at org.apache.tika.config.TikaConfig$XmlLoader.loadOverall(TikaConfig.java:550)
    at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:187)
    at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:168)
    at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:161)
    at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:157)
    at org.apache.tika.dl.imagerec.DL4JVGG16NetTest.recognise(DL4JVGG16NetTest.java:31)


Results :

Tests in error:
  DL4JVGG16NetTest.recognise:31 » OutOfMemory Cannot allocate new FloatPointer(1...

Tests run: 2, Failures: 0, Errors: 1, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:19 min
[INFO] Finished at: 2017-07-05T16:06:29-07:00
[INFO] Final Memory: 61M/1020M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on project tika-dl: There are test failures.
[ERROR]
[ERROR] Please refer to /Users/mattmann/tmp/tika1.15/tika-dl/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
LMC-053601:tika-dl mattmann$
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> To improve object recognition parser so that it may work without external
> RESTful service setup
> -----------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2298
>                 URL: https://issues.apache.org/jira/browse/TIKA-2298
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: Avtar Singh
>              Labels: ObjectRecognitionParser
>             Fix For: 1.16
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> When ObjectRecognitionParser was built to do image recognition, there wasn't
> good support for Java frameworks. All the popular neural networks were in
> C++ or python. Since there was nothing that runs within JVM, we tried
> several ways to glue them to Tika (like CLI, JNI, gRPC, REST).
> However, this game is changing slowly now. Deeplearning4j, the most famous
> neural network library for JVM, now supports importing models that are
> pre-trained in python/C++ based kits [5].
> *Improvement:*
> It will be nice to have an implementation of ObjectRecogniser that
> doesn't require any external setup(like installation of native libraries or
> starting REST services). Reasons: easy to distribute and also to cut the IO
> time.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)