Re: Stop using Perfkit Benchmarker tool in all tests?

Udi Meiri Mon, 08 Jul 2019 11:11:49 -0700

The Python 3 incompatibility is reason enough to move off of Perfkit. (+1)

On Mon, Jul 8, 2019 at 9:49 AM Mark Liu <mark...@apache.org> wrote:


> Thanks for summarizing this discussion and posting it to the dev list. I
> have been working closely on the Python performance tests, and those Perfkit
> problems are really painful. So +1 to removing Perfkit and also removing the
> tests that are no longer maintained.
>
> For #2 (Python performance tests), there is no special setup for them.
> The only missing part I can see is metrics collection and data upload to
> shared storage (e.g. BigQuery), which the Perfkit framework provides for
> free. This seems common to all languages, so I wonder whether a shared
> infra is possible.
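
A minimal sketch of what language-independent metrics collection could look like on the Python side, assuming a small publisher interface. The names `run_and_collect` and `InMemoryPublisher` are hypothetical and are not existing Beam or Perfkit APIs; a real publisher would write the rows to shared storage such as BigQuery rather than keep them in memory.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class InMemoryPublisher:
    """Hypothetical stand-in for a publisher that uploads rows to shared storage."""
    rows: List[Dict[str, object]] = field(default_factory=list)

    def publish(self, rows: List[Dict[str, object]]) -> None:
        self.rows.extend(rows)


def run_and_collect(
    test_name: str,
    run_pipeline: Callable[[], Optional[Dict[str, float]]],
    publisher: InMemoryPublisher,
    clock: Callable[[], float] = time.monotonic,
) -> List[Dict[str, object]]:
    """Time one test run and publish its metrics as flat rows.

    `run_pipeline` is assumed to block until the pipeline finishes and to
    return any pipeline-level metrics it gathered (or None).
    """
    start = clock()
    extra = run_pipeline() or {}
    runtime = clock() - start

    rows: List[Dict[str, object]] = [
        {"test": test_name, "metric": "runtime_sec", "value": runtime}
    ]
    # Sort by metric name so the published rows have a stable order.
    rows += [
        {"test": test_name, "metric": name, "value": value}
        for name, value in sorted(extra.items())
    ]
    publisher.publish(rows)
    return rows
```

Keeping the publisher behind a single `publish(rows)` method means the timing and row-building logic stays the same regardless of where the data ends up, which is the part that would be shared across tests.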
>
> Mark
>
> On Wed, Jul 3, 2019 at 9:36 AM Lukasz Cwik <lc...@google.com> wrote:
>
>> Makes sense to me to move forward with your suggestion.
>>
>> On Wed, Jul 3, 2019 at 3:57 AM Łukasz Gajowy <lukasz.gaj...@gmail.com>
>> wrote:
>>
>>>> Are there features in Perfkit that we would like to be using that we
>>>> aren't?
>>>>
>>>
>>> Besides the Kubernetes-related code I mentioned above (which, I believe,
>>> can be easily replaced), I don't see any added value in having Perfkit. The
>>> Kubernetes parts could be replaced with a set of fine-grained Gradle tasks
>>> invoked by other high-level tasks and Jenkins job steps. There also seem
>>> to be some Gradle + Kubernetes plugins out there that might prove useful
>>> here (I have not done any solid research in that area).
>>>
>>>
>>>> Can we make the integration with Perfkit less brittle?
>>>>
>>>
>>> There was an idea to move all of the Beam benchmark code out of Perfkit (
>>> beam_benchmark_helper.py
>>> <https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/blob/5680e174ad1799056b4b6d4a6600ef9f93fe39ad/perfkitbenchmarker/beam_benchmark_helper.py>
>>> , beam_integration_benchmark.py
>>> <https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/blob/7cdcea2561c66baa838e3ce4d776236a248e6700/perfkitbenchmarker/linux_benchmarks/beam_integration_benchmark.py>)
>>> to the Beam repository and inject it into Perfkit every time we use it.
>>> However, that would take time and effort, and it would still not solve the
>>> problems I listed above. It would also still require Beam developers to
>>> know how Perfkit works, which we can avoid by using the existing tools
>>> (Gradle, Jenkins).
>>>
>>> Thanks!
>>>
>>> On Fri, Jun 28, 2019 at 5:31 PM Lukasz Cwik <lc...@google.com> wrote:
>>>
>>>> +1 for removing tests that are not maintained.
>>>>
>>>> Are there features in Perfkit that we would like to be using that we
>>>> aren't?
>>>> Can we make the integration with Perfkit less brittle?
>>>>
>>>> If we aren't getting much and don't plan to get much value in the short
>>>> term, removal makes sense to me.
>>>>
>>>> On Thu, Jun 27, 2019 at 3:16 AM Łukasz Gajowy <lgaj...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> moving the discussion to the dev list:
>>>>> https://github.com/apache/beam/pull/8919. I think that Perfkit
>>>>> Benchmarker should be removed from all our tests.
>>>>>
>>>>> Problems that we face currently:
>>>>>
>>>>>    1. Changes to Gradle tasks/build configuration in the Beam
>>>>>    codebase have to be reflected in Perfkit code. This requires PRs to
>>>>>    Perfkit, which can take a long time, and the tests sometimes break in
>>>>>    the meantime (no change in Perfkit + change already in Beam =
>>>>>    incompatibility). This is what happened in PR 8919 (above),
>>>>>    2. It can't run on Python 3 (it depends on Python 2-only libraries
>>>>>    such as functools32),
>>>>>    3. It is black-box testing, which makes it hard to collect
>>>>>    pipeline-related metrics,
>>>>>    4. Measurement of run time is inaccurate,
>>>>>    5. It offers relatively little flexibility compared with e.g.
>>>>>    Jenkins tasks in terms of setting up the testing infrastructure
>>>>>    (runners, databases). For example, if we'd like to set up a Flink
>>>>>    runner and reuse it in subsequent tests in one go, that would be
>>>>>    impossible. We can easily do this in Jenkins.
>>>>>
>>>>> Tests that use Perfkit:
>>>>>
>>>>>    1.  IO integration tests,
>>>>>    2.  Python performance tests,
>>>>>    3.  beam_PerformanceTests_Dataflow (disabled),
>>>>>    4.  beam_PerformanceTests_Spark (failing constantly, appears to be
>>>>>    unmaintained).
>>>>>
>>>>> From the IOIT perspective (1), only the code that sets up/tears down
>>>>> Kubernetes resources is useful right now, but these parts can be easily
>>>>> implemented in Jenkins/Gradle code. That would make Perfkit obsolete in
>>>>> IOIT because we already collect metrics using the Metrics API and store
>>>>> them in BigQuery directly.
>>>>>
>>>>> As for point 2: I have no knowledge of how complex the task would be
>>>>> (help needed).
>>>>>
>>>>> Regarding 3, 4: Those tests seem to be unmaintained - should we
>>>>> remove them?
>>>>>
>>>>> Opinions?
>>>>>
>>>>> Thank you,
>>>>> Łukasz
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
