lewismc commented on a change in pull request #2:
URL: https://github.com/apache/tika-docker/pull/2#discussion_r639888802



##########
File path: README.md
##########
@@ -11,7 +11,11 @@ There is a minimal version, which contains only Apache Tika and it's core depend
 * Italian
 * Spanish.
 
-To install more languages simply update the apt-get command to include the package containing the language you required, or include your own custom packs using an ADD command.
+To install more languages simply use `docker-build.sh` or manually using [docker --build-arg](https://docs.docker.com/engine/reference/commandline/build/#set-build-time-variables---build-arg)
+
+For see with version is supported by tesseract on official package:

Review comment:
       > For see with version is supported by tesseract on official package:
   
   Change to 
   
   > Obtain a list of official Tesseract packages by executing (on Linux):

##########
File path: docker-tool.sh
##########
@@ -58,13 +60,18 @@ test_docker_image() {
 shift $((OPTIND -1))
 subcommand=$1; shift
 version=$1; shift
+tesseract_languages=$1; shift
 
 case "$subcommand" in
   build)
+    build_args="--build-arg TIKA_VERSION=${version}"
+    if [[ ! -z "$tesseract_languages" ]]; then
+      build_args="$build_args --build-arg TESSERACT_LANGUAGES='${tesseract_languages}'"
+    fi
     # Build slim version with minimal dependencies
     docker build -t apache/tika:${version} --build-arg TIKA_VERSION=${version} - < minimal/Dockerfile --no-cache
     # Build full version with OCR, Fonts and GDAL
-    docker build -t apache/tika:${version}-full --build-arg TIKA_VERSION=${version} - < full/Dockerfile --no-cache
+    docker build -t apache/tika:${version}-full ${build_args} - < full/Dockerfile --no-cache

Review comment:
       @mhf-ir this is the same as what @dameikle has suggested...
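
The build-argument assembly in this hunk can also be written with a Bash array, which keeps a language value containing spaces as a single argument without embedding quote characters in a string. This is only a minimal sketch, not the code from the pull request: `make_build_args` is a hypothetical helper, and `1.26` and `eng deu` are example values chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: collect docker build arguments in an array instead of a string.
# make_build_args is a hypothetical helper, not part of docker-tool.sh.
make_build_args() {
  local version=$1
  local tesseract_languages=${2:-}   # optional second argument

  # Each array element becomes exactly one argument when expanded
  # as "${build_args[@]}", so no embedded quotes are needed.
  build_args=(--build-arg "TIKA_VERSION=${version}")
  if [[ -n "$tesseract_languages" ]]; then
    build_args+=(--build-arg "TESSERACT_LANGUAGES=${tesseract_languages}")
  fi
}

# Example values only; print one argument per line to inspect the result.
make_build_args "1.26" "eng deu"
printf '%s\n' "${build_args[@]}"
```

The array would then be expanded as `"${build_args[@]}"` on the `docker build` command line in place of the unquoted `${build_args}` string.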

##########
File path: docker-tool.sh
##########
@@ -21,11 +21,14 @@ while getopts ":h" opt; do
   case ${opt} in
     h )
       echo "Usage:"
-      echo "    docker-tool.sh -h                      Display this help message."
-      echo "    docker-tool.sh build <TIKA_VERSION>    Builds images for <TIKA_VERSION>."
-      echo "    docker-tool.sh test <TIKA_VERSION>     Tests images for <TIKA_VERSION>."
-      echo "    docker-tool.sh publish <TIKA_VERSION>  Publishes images for <TIKA_VERSION> to Docker Hub."
-      echo "    docker-tool.sh latest <TIKA_VERSION>   Tags images for <TIKA_VERSION> as latest on Docker Hub."
+      echo "    docker-tool.sh -h                                              Display this help message."
+      echo "    docker-tool.sh build <TIKA_VERSION> [<TESSERACT_LANGUAGES>]    Builds images for <TIKA_VERSION> via special [<TESSERACT_LANGUAGES>]."
+      echo "    docker-tool.sh test <TIKA_VERSION>                             Tests images for <TIKA_VERSION>."
+      echo "    docker-tool.sh publish <TIKA_VERSION>                          Publishes images for <TIKA_VERSION> to Docker Hub."
+      echo "    docker-tool.sh latest <TIKA_VERSION>                           Tags images for <TIKA_VERSION> as latest on Docker Hub."
+      echo ""
+      echo "Note: [<TESSERACT_LANGUAGES>] is optional for full image,"
+      echo "      for change default tesseract-ocr packages."

Review comment:
       Change 
   
   > ...for change default tesseract-ocr packages.
   
   to
   
   > ...to customize various tesseract-ocr packages. Otherwise the default packages are installed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

