On Wednesday, August 17, 2016 at 9:37:12 AM UTC-4, zdenop wrote:
>
> If there is other solution how to separate "must" part of the project with 
> "optional" data on github.com, please share it.
>

The issue is that this separation creates the problem I mentioned: you 
can't simply clone the GitHub repository and use it directly as your 
tessdata dir. Combining the two (optional and "must", as you say) would 
mean that some people would probably delete files from their clone to keep 
disk-space usage down. At least that's what I did.

My own process is to install tesseract via Homebrew. That gets me a minimal 
set-up with respect to the trained data files and means that I get updates 
whenever a major release makes it to Homebrew. Then I symlink the data files 
from my GitHub clone into that installation's tessdata dir. This means that 
whenever tesseract gets updated via Homebrew, I have to recreate the 
symlinks. Not a big deal, but not nothing either.
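
For what it's worth, recreating the symlinks could be scripted along these 
lines. This is only a rough sketch, not what I actually run: the function 
name and both paths in the usage comment are made up for illustration, and 
it assumes the data files all end in .traineddata and sit at the top level 
of the clone.

```python
from pathlib import Path


def link_traineddata(clone_dir, tessdata_dir):
    """Symlink every *.traineddata file from clone_dir into tessdata_dir.

    Stale symlinks are replaced; real files already present in
    tessdata_dir (e.g. the ones the package manager installed) are
    left untouched. Returns the names of the files that were linked.
    """
    clone = Path(clone_dir).resolve()
    target = Path(tessdata_dir)
    linked = []
    for src in sorted(clone.glob("*.traineddata")):
        dest = target / src.name
        if dest.is_symlink():
            dest.unlink()  # drop an old link so it can be recreated
        elif dest.exists():
            continue  # a real file lives here; do not overwrite it
        dest.symlink_to(src)
        linked.append(src.name)
    return linked


# Hypothetical usage; both paths are placeholders, adjust to your set-up:
# link_traineddata("/path/to/tessdata-clone", "/path/to/installed/tessdata")
```

Running it again after each update would rebuild the links without touching 
whatever files the new install shipped.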

So it's a trade-off. Some people would likely modify their set-up in either 
case: to copy or link files, as now, or to delete them. My current 
thinking is that the latter would be preferable for me, but I recognize 
that not everyone will agree with that. I assume it's possible to have an 
installation via Homebrew (or whatever) that ignores the "extra" data 
files, or possibly two separate installations, a minimal and a full one.
