[ https://issues.apache.org/jira/browse/FLINK-15474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007531#comment-17007531 ]
Chesnay Schepler commented on FLINK-15474:
------------------------------------------
I'm not too fond of the idea of downloading several releases for a single test,
particularly when running these locally on a maybe-not-so-powerful connection.
Sure, we can also use the DownloadCache infrastructure in tests, but we can't
assume that everyone will set the appropriate properties when running tests.
My impression is that this is mostly a matter of convenience: today, creating
the snapshots means going back to a specific release, modifying several tests
in different ways or setting various properties, compiling things, and copying
them to the correct location.
If you have a POC that can automate this entire process, well then let's use
that to generate the snapshots. We'd still have the binary snapshots in the
repo, but that's less of a problem imo, and could also be solved by storing the
snapshots externally and downloading them as needed, which should be
significantly cheaper than fetching releases.
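
To make the "store externally, download as needed" idea concrete, here is a
minimal sketch of a fetch-if-absent cache for snapshot files. It is not the
DownloadCache infrastructure mentioned above; the class name SnapshotCache, the
base URI and the file names are all hypothetical, and it assumes the snapshots
are small files reachable under a single base URI that ends with "/".

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: downloads a snapshot once and serves it from a local directory afterwards. */
final class SnapshotCache {
    private final Path cacheDir;
    private final URI baseUri; // assumption: ends with "/", so resolve() appends to it

    SnapshotCache(Path cacheDir, URI baseUri) {
        this.cacheDir = cacheDir.toAbsolutePath().normalize();
        this.baseUri = baseUri;
    }

    /** Returns the local path of the snapshot, fetching it only if it is not cached yet. */
    Path resolve(String relativeName) throws IOException {
        Path target = cacheDir.resolve(relativeName).normalize();
        if (!target.startsWith(cacheDir)) {
            throw new IllegalArgumentException("Name escapes the cache directory: " + relativeName);
        }
        if (Files.exists(target)) {
            return target; // cache hit: no network access
        }
        Files.createDirectories(target.getParent());
        // Download to a temporary file first so a failed download never leaves a partial snapshot behind.
        Path tmp = Files.createTempFile(target.getParent(), "snapshot", ".part");
        try (InputStream in = baseUri.resolve(relativeName).toURL().openStream()) {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp);
        }
        return target;
    }
}
```

A test would call something like cache.resolve("pojo-1.9.snapshot") and read the
returned file. Only the first run touches the network, and each snapshot is
typically far smaller than a full release.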
> In TypeSerializerUpgradeTestBase, create serializer snapshots "on demand"
> -------------------------------------------------------------------------
>
> Key: FLINK-15474
> URL: https://issues.apache.org/jira/browse/FLINK-15474
> Project: Flink
> Issue Type: Bug
> Components: API / Type Serialization System, Tests
> Affects Versions: 1.9.0, 1.10.0
> Reporter: Aljoscha Krettek
> Assignee: Aljoscha Krettek
> Priority: Major
> Fix For: 1.10.0
>
>
> Currently, we store binary snapshots in the repository for all the different
> serializer upgrade test configurations (see linked POC for which snapshots
> are there for just the POJO serializer). This is hard to maintain, because
> someone has to go back and generate snapshots from previous Flink versions
> and add them to the repo when updating the tests for a new Flink version.
> It's also problematic from a repository perspective because we keep piling up
> binary snapshots.
> Instead, we can create a snapshot "on demand" from a previous Flink version
> by using a classloader that has the previous Flink jar.
> I created a POC that demonstrates the approach:
> [https://github.com/aljoscha/flink/tree/jit-serializer-test-base]. The
> advantage is that we don't need binary snapshots in the repo anymore;
> updating the tests to a newer Flink version should be as easy as adding a new
> migration version and Flink download URL. The downside is that the test now
> downloads Flink releases (in the POC this is done using the caching infra
> introduced for e2e tests), which is costly, and it also re-generates the
> snapshots for every test, which is also costly. The test time (minus
> downloading) goes up from about 300 ms to roughly 6 seconds. That's not
> something I would call a unit test. We could call these "integration tests"
> (or even e2e tests) and only run them nightly. Side note, we don't have test
> coverage for serializer upgrades from 1.8 and 1.9 currently, so even doing it
> nightly would be an improvement.
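
For readers unfamiliar with the classloader approach described in the issue,
the following is a rough sketch of the general technique, not the code from the
linked POC. The class PreviousVersionLoader and its methods are hypothetical.
It assumes the previous release's jars are already on disk and that the code to
run inside them is exposed as a public static no-argument method.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;

/** Hypothetical helper: runs code from an older release in isolation from the test classpath. */
final class PreviousVersionLoader {

    /** Opens a class loader that sees only the given jars plus the JDK's bootstrap classes. */
    static URLClassLoader open(Path... jars) throws IOException {
        URL[] urls = new URL[jars.length];
        for (int i = 0; i < jars.length; i++) {
            urls[i] = jars[i].toUri().toURL();
        }
        // A null parent means "delegate to the bootstrap loader only", so classes on the
        // test classpath (the current version) cannot shadow the ones in the old jars.
        // On Java 9+, ClassLoader.getPlatformClassLoader() may be a better parent if the
        // old jars need JDK modules that the bootstrap loader does not define.
        return new URLClassLoader(urls, null);
    }

    /** Loads className through the given loader and calls its public static no-arg method. */
    static Object invokeStatic(ClassLoader loader, String className, String methodName)
            throws ReflectiveOperationException {
        Class<?> clazz = Class.forName(className, true, loader);
        return clazz.getMethod(methodName).invoke(null);
    }
}
```

Results have to cross the loader boundary as JDK types. A byte[] holding the
serialized snapshot works, because arrays of primitives are defined by the
bootstrap loader and are therefore the same type on both sides. An instance of
a class from the old jars would not be castable to the class of the same name
on the test classpath.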