[
https://issues.apache.org/jira/browse/ORC-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506178#comment-16506178
]
Wes McKinney commented on ORC-374:
----------------------------------
Apologies, the artifact that I pulled was from
https://github.com/apache/orc/releases. I will open an Arrow JIRA about using
the source release artifact (this was an oversight from the original patch
adding ORC read support). We are also going to add support for offline builds
(ARROW-902) to help mitigate the issue. Our CMake configuration currently
expects to download third-party dependencies from the internet during the
build, which is a problem on slow or unreliable connections (e.g. on airplanes).
> Possible to reduce size of release tarballs?
> --------------------------------------------
>
> Key: ORC-374
> URL: https://issues.apache.org/jira/browse/ORC-374
> Project: ORC
> Issue Type: Improvement
> Reporter: Wes McKinney
> Priority: Major
>
> We are building the Apache ORC C++ library as a dependency of Apache Arrow. I
> have noticed that the latest release tarball for ORC is about 13 MB.
> It looks like this is caused by a combination of
> * Data files used for testing
> * Generated Javadoc
> Here's the {{du}} output:
> {code}
> $ du -d 2 -h .
> 14M ./examples/expected
> 23M ./examples
> 12K ./proto
> 48K ./cmake_modules
> 40K ./site/develop
> 12K ./site/security
> 18M ./site/api
> 24K ./site/_layouts
> 16K ./site/_data
> 16K ./site/js
> 468K ./site/img
> 8.0K ./site/help
> 116K ./site/specification
> 16K ./site/news
> 8.0K ./site/talks
> 520K ./site/fonts
> 24K ./site/_sass
> 88K ./site/_includes
> 120K ./site/_posts
> 108K ./site/_docs
> 32K ./site/css
> 20M ./site
> 8.0K ./docker/centos7
> 8.0K ./docker/centos6
> 8.0K ./docker/ubuntu16-clang5
> 8.0K ./docker/ubuntu12
> 8.0K ./docker/debian8
> 8.0K ./docker/debian7
> 8.0K ./docker/ubuntu14
> 8.0K ./docker/ubuntu16
> 76K ./docker
> 256K ./tools/test
> 56K ./tools/src
> 320K ./tools
> 8.0K ./.git/info
> 28K ./.git/refs
> 52K ./.git/hooks
> 32K ./.git/logs
> 4.0K ./.git/branches
> 22M ./.git/objects
> 22M ./.git
> 64K ./java/examples
> 260K ./java/mapreduce
> 2.3M ./java/core
> 472K ./java/tools
> 128K ./java/shims
> 356K ./java/bench
> 3.6M ./java
> 708K ./c++/test
> 104K ./c++/include
> 664K ./c++/src
> 948K ./c++/libs
> 2.5M ./c++
> 71M .
> {code}
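To see at a glance which top-level directories dominate a listing like the one above, the {{du -h}} output can be parsed and ranked. This is only a rough sketch: the function names (parse_size, top_level_sizes, largest) are hypothetical, it assumes {{du -h}} sizes use binary units (K, M, G), and the sample is a handful of lines copied from the listing rather than the full output.

```python
import re

# Multipliers for the human-readable suffixes printed by `du -h`
# (assumed to be binary units; a bare number is treated as bytes).
_UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a `du -h` size such as '8.0K' or '14M' to bytes."""
    match = re.fullmatch(r"([0-9.]+)([BKMG]?)", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit or "B"])


def top_level_sizes(du_output):
    """Return {path: bytes} for depth-1 entries such as './site'."""
    sizes = {}
    for line in du_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        size, path = parts[0], parts[1].strip()
        # Depth-1 entries look like './name', i.e. exactly one slash.
        if path.startswith("./") and path.count("/") == 1:
            sizes[path] = parse_size(size)
    return sizes


def largest(du_output, n=3):
    """Return the n largest depth-1 directories, biggest first."""
    sizes = top_level_sizes(du_output)
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)[:n]


# A few lines copied from the listing above (not the full output).
sample = """\
14M ./examples/expected
23M ./examples
18M ./site/api
20M ./site
22M ./.git
3.6M ./java
2.5M ./c++
71M .
"""

if __name__ == "__main__":
    for path, size in largest(sample):
        print(f"{size / 1024**2:6.1f} MiB  {path}")
```

On the sample lines, the three largest entries are ./examples, ./.git and ./site, in that order.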