[ 
https://issues.apache.org/jira/browse/ORC-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506178#comment-16506178
 ] 

Wes McKinney commented on ORC-374:
----------------------------------

Apologies, the artifact that I pulled was from 
https://github.com/apache/orc/releases. I will open an Arrow JIRA about using 
the source release artifact (this was an oversight from the original patch 
adding ORC read support). We are going to add support for offline builds 
(ARROW-902) to help mitigate the issue -- our CMake configuration currently 
expects to be able to download third-party dependencies from the internet during 
the build process, which can be a problem on slow or unavailable connections 
(e.g. on airplanes).
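
To make the offline-build idea concrete, here is a minimal sketch of the 
fallback logic, assuming a per-dependency environment variable that points at 
a pre-downloaded source archive. The variable name pattern 
(<NAME>_SOURCE_ARCHIVE), the function name, and the example URL are 
hypothetical and only illustrate the approach; they are not taken from the 
Arrow build system or from ARROW-902.

```python
import os
from pathlib import Path


def resolve_source(name, default_url, env=None):
    """Decide where a third-party source archive should come from.

    Returns ("local", path) when the hypothetical override variable
    <NAME>_SOURCE_ARCHIVE points at an existing file, and
    ("download", default_url) when no override is set.
    """
    env = os.environ if env is None else env
    # Hypothetical naming convention, e.g. ORC_SOURCE_ARCHIVE for "orc".
    override = env.get(f"{name.upper()}_SOURCE_ARCHIVE")
    if override:
        if not Path(override).is_file():
            # Fail early instead of silently falling back to the network.
            raise FileNotFoundError(
                f"{name}: override archive not found: {override}"
            )
        return ("local", override)
    return ("download", default_url)


if __name__ == "__main__":
    # Placeholder URL; a real build would use the release artifact location.
    print(resolve_source("orc", "https://example.invalid/orc-src.tar.gz"))
```

Failing when the override is set but the file is missing is a deliberate 
choice in this sketch: an offline build should not quietly try the network.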

> Possible to reduce size of release tarballs?
> --------------------------------------------
>
>                 Key: ORC-374
>                 URL: https://issues.apache.org/jira/browse/ORC-374
>             Project: ORC
>          Issue Type: Improvement
>            Reporter: Wes McKinney
>            Priority: Major
>
> We are building the Apache ORC C++ library as a dependency of Apache Arrow. I 
> have noticed that the latest release tarball for ORC is about 13 MB. 
> It looks like this is caused by a combination of:
> * Data files used for testing
> * Generated Javadoc
> Here's the {{du}} output:
> {code}
> $ du -d 2 -h .
> 14M   ./examples/expected
> 23M   ./examples
> 12K   ./proto
> 48K   ./cmake_modules
> 40K   ./site/develop
> 12K   ./site/security
> 18M   ./site/api
> 24K   ./site/_layouts
> 16K   ./site/_data
> 16K   ./site/js
> 468K  ./site/img
> 8.0K  ./site/help
> 116K  ./site/specification
> 16K   ./site/news
> 8.0K  ./site/talks
> 520K  ./site/fonts
> 24K   ./site/_sass
> 88K   ./site/_includes
> 120K  ./site/_posts
> 108K  ./site/_docs
> 32K   ./site/css
> 20M   ./site
> 8.0K  ./docker/centos7
> 8.0K  ./docker/centos6
> 8.0K  ./docker/ubuntu16-clang5
> 8.0K  ./docker/ubuntu12
> 8.0K  ./docker/debian8
> 8.0K  ./docker/debian7
> 8.0K  ./docker/ubuntu14
> 8.0K  ./docker/ubuntu16
> 76K   ./docker
> 256K  ./tools/test
> 56K   ./tools/src
> 320K  ./tools
> 8.0K  ./.git/info
> 28K   ./.git/refs
> 52K   ./.git/hooks
> 32K   ./.git/logs
> 4.0K  ./.git/branches
> 22M   ./.git/objects
> 22M   ./.git
> 64K   ./java/examples
> 260K  ./java/mapreduce
> 2.3M  ./java/core
> 472K  ./java/tools
> 128K  ./java/shims
> 356K  ./java/bench
> 3.6M  ./java
> 708K  ./c++/test
> 104K  ./c++/include
> 664K  ./c++/src
> 948K  ./c++/libs
> 2.5M  ./c++
> 71M   .
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
