[
https://issues.apache.org/jira/browse/ARROW-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Neal Richardson updated ARROW-6407:
-----------------------------------
Description:
This will prevent issues like ARROW-6406.
After ARROW-8266 ensures every _ep has at least two URLs, this problem becomes
harder, since we have a few classes of URL which don't necessarily overlap:
preferred (highest priority during build configuration), canonical (whence new
tarballs should be downloaded when versions get bumped), and backup. It's
difficult to represent this in a txt file.
A script should be available for automatically bumping versions (a
generalization of upload-boost.sh):
- update versions.txt, including checksums
- run download_dependencies.sh to get fresh archives
- run (or inline) trim-boost.sh to trim the boost archive
- update ursalabs-managed archives:
https://github.com/ursa-labs/thirdparty/releases/tag/latest
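As a rough illustration of the first step in the list above (updating versions.txt, including checksums), here is a minimal Python sketch. It assumes a hypothetical `NAME_VERSION=` / `NAME_SHA256=` line format; the real layout of versions.txt may differ, and `bump_entry` and `sha256_of` are names invented for this example.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of an archive's bytes."""
    return hashlib.sha256(data).hexdigest()


def bump_entry(text: str, name: str, version: str, checksum: str) -> str:
    """Rewrite the (assumed) NAME_VERSION= and NAME_SHA256= lines.

    Lines for other dependencies are passed through unchanged.
    """
    replacements = {
        f"{name}_VERSION": version,
        f"{name}_SHA256": checksum,
    }
    out = []
    for line in text.splitlines():
        key, sep, _ = line.partition("=")
        if sep and key in replacements:
            line = f"{key}={replacements[key]}"
        out.append(line)
    return "\n".join(out) + "\n"
```

Keeping the version and its checksum in one rewrite step means the two cannot drift apart when a dependency is bumped.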
Checksums:
If a checksum is not provided to ExternalProject_Add, it may re-download a
tarball even if that's not necessary (any time the generated build must be
modified, IIUC). Ideally we should provide a checksum for all _eps (and not
just Thrift) to cut down on unnecessary network access when building bundled
dependencies.
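To show why a checksum helps avoid redundant downloads, here is a minimal Python sketch of the general idea. It is not CMake's actual logic; `needs_download` and `file_sha256` are hypothetical helpers, and treating a missing checksum as "always download" is an assumption made for this example.

```python
import hashlib
from pathlib import Path
from typing import Optional


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path: Path, expected_sha256: Optional[str]) -> bool:
    """Decide whether a cached tarball can be reused."""
    if expected_sha256 is None:
        # Assumption: without a checksum the cached file cannot be trusted.
        return True
    if not path.is_file():
        return True
    return file_sha256(path) != expected_sha256.lower()
```

With a known checksum, a cached archive that still matches can be reused without any network access.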
ARROW-8222 introduces a subtlety: we now default to a trimmed boost archive
which contains only the 10% which we need, only falling back to a full boost
archive when that fails to download. The trimmed archive is faster to download
but is not identical to the full boost archive, so we can't provide a single
checksum which matches both. This will probably just mean that those cases are
extra slow.
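One possible way around the single-checksum limitation, sketched in Python under the assumption that the download step could be driven by a script, is to pair each URL with its own checksum. `fetch_first_valid` is a hypothetical helper, and the fetch function is injected so the sketch makes no assumption about how archives are actually retrieved.

```python
import hashlib
from typing import Callable, Iterable, Optional, Tuple


def fetch_first_valid(
    candidates: Iterable[Tuple[str, str]],
    fetch: Callable[[str], bytes],
) -> Optional[Tuple[str, bytes]]:
    """Try (url, sha256) pairs in priority order.

    Returns the first (url, data) whose digest matches its own expected
    checksum, or None if every candidate fails.
    """
    for url, expected in candidates:
        try:
            data = fetch(url)
        except OSError:
            continue  # download failed; fall back to the next URL
        if hashlib.sha256(data).hexdigest() == expected.lower():
            return url, data
    return None
```

Listing the trimmed archive first and the full archive second keeps the fast path as the default while still validating whichever archive is actually used.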
was:
This will prevent issues like ARROW-6406
After ARROW-8266 ensures every _ep has at least two URLs this problem becomes
harder since we have a few classes of URL which don't necessarily overlap:
preferred (highest priority during build configuration), canonical (whence new
tarballs should be downloaded when versions get bumped), and backup. It's
difficult to represent this in a txt file.
A script should be available for automatically bumping versions (a
generalization of upload-boost.sh):
- update versions.txt, including checksums
- run download_dependencies.sh to get fresh archives
- run (or inline) trim-boost.sh to trim the boost archive
- update ursalabs-managed archives in https://dl.bintray.com/ursalabs/ and
https://github.com/ursa-labs/thirdparty/releases/tag/latest
Checksums:
If a checksum is not provided to ExternalProject_Add it may re-download a
tarball even if that's not necessary (any time the generated build must be
modified, IIUC). Ideally we should provide a checksum for all _eps (and not
just Thrift) to cut down on unnecessary network access when building bundled.
ARROW-8222 introduces a subtlety: we now default to a trimmed boost archive
which contains only the 10% which we need, only falling back to a full boost
archive when that fails to download. This is faster but not equivalent to the
full boost archive so we can't provide a single checksum which matches both.
This will probably just mean that those cases are extra slow
> [C++] Consolidate thirdparty bundle URLs, version bumping logic, etc
> --------------------------------------------------------------------
>
> Key: ARROW-6407
> URL: https://issues.apache.org/jira/browse/ARROW-6407
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Affects Versions: 0.16.0
> Reporter: Wes McKinney
> Priority: Major
> Fix For: 4.0.0