[ https://issues.apache.org/jira/browse/ORC-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500562#comment-16500562 ]
Owen O'Malley commented on ORC-374:
-----------------------------------
I assume you are pulling the source tarball from Maven Central.
The official 1.5.1 tarball from [https://orc.apache.org/docs/releases.html] is
13 MB and expands into a 45 MB directory.
{code:java}
owen@laptop> du -hs orc-1.5.1/*
8.0K orc-1.5.1/CMakeLists.txt
16K orc-1.5.1/LICENSE
4.0K orc-1.5.1/NOTICE
4.0K orc-1.5.1/README.md
2.4M orc-1.5.1/c++
40K orc-1.5.1/cmake_modules
40K orc-1.5.1/docker
23M orc-1.5.1/examples
3.2M orc-1.5.1/java
8.0K orc-1.5.1/proto
16M orc-1.5.1/site
308K orc-1.5.1/tools
{code}
Of that, the 23 MB of example ORC files and the 16 MB site are in theory
separable, but it is useful for users to get them as part of the release.
I didn't realize until recently that when I uploaded the release to Maven
Central, it was making a source tarball at the top level that included
transient files. Next time I'll make it from a clean directory straight from
the tarball.
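As a rough way to check what a given tarball actually contains before it is uploaded, the sketch below sums the uncompressed file sizes per top-level entry using Python's standard {{tarfile}} module. The function names and the in-memory demo archive ({{demo-1.0}}) are hypothetical and only there so the example runs without a real release. Note that it sums file sizes rather than disk blocks, so the numbers will not match {{du}} exactly.

```python
import io
import tarfile
from collections import defaultdict


def top_level_sizes(tar):
    """Sum uncompressed file sizes per entry directly under the archive root."""
    sizes = defaultdict(int)
    for member in tar.getmembers():
        if not member.isfile():
            continue
        parts = member.name.split("/")
        # parts[0] is the root directory (e.g. "demo-1.0");
        # group by the next path component when there is one.
        key = parts[1] if len(parts) > 1 else parts[0]
        sizes[key] += member.size
    return dict(sizes)


def build_demo_tarball():
    """Build a tiny in-memory tarball (hypothetical contents) for the demo."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, size in [
            ("demo-1.0/README.md", 10),
            ("demo-1.0/examples/a.orc", 300),
            ("demo-1.0/examples/b.orc", 200),
            ("demo-1.0/site/index.html", 50),
        ]:
            info = tarfile.TarInfo(name)
            info.size = size
            tar.addfile(info, io.BytesIO(b"\0" * size))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    with tarfile.open(fileobj=build_demo_tarball(), mode="r:gz") as tar:
        totals = top_level_sizes(tar)
    # Largest entries first, similar in spirit to `du -hs dir/*`.
    for name, size in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{size:>8} {name}")
```

To inspect a real archive, one would open it with {{tarfile.open(path)}} instead of the demo buffer. An unexpected top-level entry such as {{.git}} would then show up in the totals.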
> Possible to reduce size of release tarballs?
> --------------------------------------------
>
> Key: ORC-374
> URL: https://issues.apache.org/jira/browse/ORC-374
> Project: ORC
> Issue Type: Improvement
> Reporter: Wes McKinney
> Priority: Major
>
> We are building the Apache ORC C++ library as a dependency of Apache Arrow. I
> have noticed that the latest release tarball for ORC is about 13 MB.
> It looks like this is caused by a combination of
> * Data files used for testing
> * Generated Javadoc
> Here's the {{du}} output
> {code}
> $ du -d 2 -h .
> 14M ./examples/expected
> 23M ./examples
> 12K ./proto
> 48K ./cmake_modules
> 40K ./site/develop
> 12K ./site/security
> 18M ./site/api
> 24K ./site/_layouts
> 16K ./site/_data
> 16K ./site/js
> 468K ./site/img
> 8.0K ./site/help
> 116K ./site/specification
> 16K ./site/news
> 8.0K ./site/talks
> 520K ./site/fonts
> 24K ./site/_sass
> 88K ./site/_includes
> 120K ./site/_posts
> 108K ./site/_docs
> 32K ./site/css
> 20M ./site
> 8.0K ./docker/centos7
> 8.0K ./docker/centos6
> 8.0K ./docker/ubuntu16-clang5
> 8.0K ./docker/ubuntu12
> 8.0K ./docker/debian8
> 8.0K ./docker/debian7
> 8.0K ./docker/ubuntu14
> 8.0K ./docker/ubuntu16
> 76K ./docker
> 256K ./tools/test
> 56K ./tools/src
> 320K ./tools
> 8.0K ./.git/info
> 28K ./.git/refs
> 52K ./.git/hooks
> 32K ./.git/logs
> 4.0K ./.git/branches
> 22M ./.git/objects
> 22M ./.git
> 64K ./java/examples
> 260K ./java/mapreduce
> 2.3M ./java/core
> 472K ./java/tools
> 128K ./java/shims
> 356K ./java/bench
> 3.6M ./java
> 708K ./c++/test
> 104K ./c++/include
> 664K ./c++/src
> 948K ./c++/libs
> 2.5M ./c++
> 71M .
> {code}