[ 
https://issues.apache.org/jira/browse/ORC-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500562#comment-16500562
 ] 

Owen O'Malley commented on ORC-374:
-----------------------------------

I assume you are pulling the source tarball from Maven Central.

The official 1.5.1 tarball from [https://orc.apache.org/docs/releases.html] is 
13 MB and expands into a 45 MB directory:
{code:java}
owen@laptop> du -hs orc-1.5.1/*
8.0K    orc-1.5.1/CMakeLists.txt
 16K    orc-1.5.1/LICENSE
4.0K    orc-1.5.1/NOTICE
4.0K    orc-1.5.1/README.md
2.4M    orc-1.5.1/c++
 40K    orc-1.5.1/cmake_modules
 40K    orc-1.5.1/docker
 23M    orc-1.5.1/examples
3.2M    orc-1.5.1/java
8.0K    orc-1.5.1/proto
 16M    orc-1.5.1/site
308K    orc-1.5.1/tools{code}

Of that, the 23 MB of example ORC files and the 16 MB site are in theory 
separable, but it is useful for users to get them as part of the release.
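
As a rough check of those numbers, here is a small Python sketch that adds up the {{du}} listing above and works out how much of the expanded tree is examples plus site. The helper names ({{parse_du}}, {{share}}) are made up for this illustration, and since {{du -h}} rounds its figures the result is only approximate.

```python
# Sizes copied from the du listing above; suffixes are du -h style (K, M).
DU_OUTPUT = """\
8.0K    orc-1.5.1/CMakeLists.txt
 16K    orc-1.5.1/LICENSE
4.0K    orc-1.5.1/NOTICE
4.0K    orc-1.5.1/README.md
2.4M    orc-1.5.1/c++
 40K    orc-1.5.1/cmake_modules
 40K    orc-1.5.1/docker
 23M    orc-1.5.1/examples
3.2M    orc-1.5.1/java
8.0K    orc-1.5.1/proto
 16M    orc-1.5.1/site
308K    orc-1.5.1/tools
"""

# Multipliers to convert each suffix into KiB.
UNITS = {"K": 1, "M": 1024, "G": 1024 * 1024}


def parse_du(text):
    """Return {path: size_in_KiB} parsed from `du -h` style lines."""
    sizes = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        size, path = parts
        sizes[path] = float(size[:-1]) * UNITS[size[-1]]
    return sizes


def share(sizes, names):
    """Fraction of the total taken by entries whose last path component is in names."""
    total = sum(sizes.values())
    picked = sum(v for p, v in sizes.items() if p.rsplit("/", 1)[-1] in names)
    return picked / total


sizes = parse_du(DU_OUTPUT)
total_mib = sum(sizes.values()) / 1024
frac = share(sizes, {"examples", "site"})
print(f"total: {total_mib:.1f} MiB, examples + site: {frac:.0%}")
```

With the figures above this comes out to about 45 MB in total, with examples and site together making up roughly 87% of it.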

I didn't realize until recently that when I uploaded the release to Maven 
Central, it was making a source tarball at the top level that included 
transient files. Next time I'll make it from a clean directory straight from 
the tarball.
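
For anyone repackaging a working tree themselves, a minimal Python sketch of leaving transient files out of a source tarball might look like the following. This is not the release process described above (which is to build from a clean directory), just an alternative illustration. The names in {{TRANSIENT}} and the function {{make_source_tarball}} are hypothetical and would need adjusting to the real tree.

```python
import os
import tarfile

# Hypothetical set of transient directory/file names to leave out.
TRANSIENT = {".git", "target", "build"}


def skip_transient(tarinfo):
    """tarfile filter: return None to drop an entry, or the TarInfo to keep it."""
    parts = tarinfo.name.split("/")
    return None if any(p in TRANSIENT for p in parts) else tarinfo


def make_source_tarball(src_dir, out_path):
    """Write src_dir to a gzip-compressed tarball at out_path, skipping transient entries."""
    arcname = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(out_path, "w:gz") as tar:
        # Excluding a directory here also skips everything beneath it.
        tar.add(src_dir, arcname=arcname, filter=skip_transient)
    return out_path
```

Calling {{make_source_tarball("orc-1.5.1", "/tmp/orc-1.5.1.tar.gz")}} (paths are examples only) would archive the tree without any {{.git}}, {{target}} or {{build}} entries. Write the output outside the source directory so the archive does not try to include itself.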



> Possible to reduce size of release tarballs?
> --------------------------------------------
>
>                 Key: ORC-374
>                 URL: https://issues.apache.org/jira/browse/ORC-374
>             Project: ORC
>          Issue Type: Improvement
>            Reporter: Wes McKinney
>            Priority: Major
>
> We are building the Apache ORC C++ library as a dependency of Apache Arrow. I 
> have noticed that the latest release tarball for ORC is about 13 MB. 
> It looks like this is caused by a combination of:
> * Data files used for testing
> * Generated Javadoc
> Here's the {{du}} output
> {code}
> $ du -d 2 -h .
> 14M   ./examples/expected
> 23M   ./examples
> 12K   ./proto
> 48K   ./cmake_modules
> 40K   ./site/develop
> 12K   ./site/security
> 18M   ./site/api
> 24K   ./site/_layouts
> 16K   ./site/_data
> 16K   ./site/js
> 468K  ./site/img
> 8.0K  ./site/help
> 116K  ./site/specification
> 16K   ./site/news
> 8.0K  ./site/talks
> 520K  ./site/fonts
> 24K   ./site/_sass
> 88K   ./site/_includes
> 120K  ./site/_posts
> 108K  ./site/_docs
> 32K   ./site/css
> 20M   ./site
> 8.0K  ./docker/centos7
> 8.0K  ./docker/centos6
> 8.0K  ./docker/ubuntu16-clang5
> 8.0K  ./docker/ubuntu12
> 8.0K  ./docker/debian8
> 8.0K  ./docker/debian7
> 8.0K  ./docker/ubuntu14
> 8.0K  ./docker/ubuntu16
> 76K   ./docker
> 256K  ./tools/test
> 56K   ./tools/src
> 320K  ./tools
> 8.0K  ./.git/info
> 28K   ./.git/refs
> 52K   ./.git/hooks
> 32K   ./.git/logs
> 4.0K  ./.git/branches
> 22M   ./.git/objects
> 22M   ./.git
> 64K   ./java/examples
> 260K  ./java/mapreduce
> 2.3M  ./java/core
> 472K  ./java/tools
> 128K  ./java/shims
> 356K  ./java/bench
> 3.6M  ./java
> 708K  ./c++/test
> 104K  ./c++/include
> 664K  ./c++/src
> 948K  ./c++/libs
> 2.5M  ./c++
> 71M   .
> {code}



