Hi Tim,

On Thu, Feb 12, 2015 at 6:17 PM, <[email protected]> wrote:
>
> All,
> I just noticed that tika-app has gone from ~30MB to ~44MB, ~20k file to
> ~27k files. 3.5 of those new MB are for README.NLDAS1.pdf and
> README.NLDAS2.pdf. Can we exclude those in the app and server? Are there
> other items that we should exclude?
>
>

I knew that this was large, but I didn't realize that the impact was quite so significant. Some of the underlying scientific libraries are pretty cumbersome, that is for sure.

There was a comment made about how we fix this. Well, this may be a case of me heading down to the Unidata folks (who maintain these libraries) and sending them some patches to clean this stuff up. Annie has been making excellent progress with relationship building between the two communities; I'll need to leverage some of her charm in a bid to get them to remove these obscene PDFs (as a start).

I'll update here with my findings. Hopefully I can get it to a stage where they publish a skinny jar of sorts!

Thanks
Lewis
