[ 
https://issues.apache.org/jira/browse/KUDU-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993254#comment-16993254
 ] 

Adar Dembo commented on KUDU-3007:
----------------------------------

Sorry for not responding earlier; been thinking about the proposal and how we 
can best leverage it.

To start, let me provide some context on how our builds and tests work today. 
Kudu testing is mostly in pre-commit, with some ad hoc testing performed by 
community members prior to a release. Despite the hostname, our 
jenkins.kudu.apache.org master and slaves aren't ASF infrastructure; they're 
GCP resources donated and managed by Cloudera. The setup consists of several GCP VMs 
running a smattering of Docker containers via Kubernetes. The source code for 
all of that infra can be found 
[here|https://github.com/cloudera/kudu-upstream-infra]. Builds are performed 
inside these containers. C++ and Java tests, however, use the [dist-test 
framework|https://github.com/cloudera/dist_test] to execute across a variety of 
GCP VMs in parallel. When a build is ready to execute tests, it submits them in 
bulk as a job to the dist-test framework hosted by Cloudera. Each job is broken 
down into a set of tasks (one per test), which are farmed out to a pool of VMs; 
the pool autoscales as needed to accommodate the load.
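
To make the job/task model concrete, here is a minimal Python sketch of the fan-out described above. It is not dist-test's actual code or API: the Task class, the function names, and the test names are all hypothetical, a thread pool stands in for the pool of VMs, and autoscaling is not modeled.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Task:
    """One test from a job (hypothetical shape, not dist-test's real schema)."""
    job_id: str
    test_name: str


def split_job(job_id, test_names):
    """Break a job into one task per test."""
    return [Task(job_id, name) for name in test_names]


def run_task(task):
    """Stand-in for a worker VM running a single test; always passes here."""
    return (task.test_name, "PASSED")


def run_job(job_id, test_names, pool_size):
    """Farm the tasks out to a fixed-size pool and collect results by test name."""
    tasks = split_job(job_id, test_names)
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return dict(pool.map(run_task, tasks))


# Illustrative test names only.
results = run_job("job-1", ["tablet-test", "consensus-test", "client-test"], pool_size=2)
```

With a pool of one worker the same code degenerates to a serial run, which is roughly the situation for an ARM pool without a dist-test deployment.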

So how can we integrate aarch64 resources into all of this? Some thoughts:
* The resources donated to builds.apache.org as part of INFRA-19369 aren't 
immediately available to us, since our Jenkins infra is separate from ASF's 
infra.
* We can certainly add your ARM VMs as Jenkins slaves to the Cloudera infra, 
provided that integrates cleanly with the [Kubernetes-based approach we 
use|https://github.com/cloudera/kudu-upstream-infra].
* Reusing dist-test will be challenging because GCP doesn't offer ARM virtual 
hardware at all, and some aspects of dist-test are hardcoded for GCP. That 
isn't to say it can't be done, but it'd require a non-trivial investment on 
your part to understand how dist-test works, modify it so it's suitable for 
your ARM VM pool, and host and manage a second dist-test deployment.
* Without dist-test, I wouldn't want ARM-based Kudu tests run in pre-commit, as 
doing so would significantly lengthen the development feedback loop.
* So maybe the right approach is a separate Jenkins job in 
jenkins.kudu.apache.org that runs periodically, building Kudu and running tests 
in the new ARM slaves? The challenge there will be to surface failures loudly 
enough that regressions are caught and addressed promptly.
* Hooking our Gerrit up to OpenLab CI is intriguing, but does that imply that 
the tests run pre-commit and gate the change? If so, we'll have the same 
lengthened feedback loop problem I described earlier. If not, test results may 
be published back to Gerrit well after the changes are merged, making them easy 
to ignore.
* Perhaps the path of least resistance is to stand up a completely separate 
build pipeline for Kudu in OpenLab CI. The only shared infrastructure would be 
build-support/jenkins/build-and-test.sh, the script used to run a build and 
some tests. It could run periodically, or it could run post-commit when a 
change is merged to master. Tests would run serially and could potentially take 
a while to complete. We'd just need to figure out how to surface the results 
back to some place where devs will notice.
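
As a rough illustration of the "surface the results" step, the following Python sketch wraps a build command and produces a short pass/fail summary that could be posted wherever developers will see it. The function name, the summary format, and the idea of keeping only the tail of the log are assumptions made for illustration; only the script path comes from the discussion above, and any arguments or environment it needs are not shown.

```python
import subprocess


def run_and_report(cmd, tail_lines=20):
    """Run a build command and return (ok, summary).

    The summary holds a status line plus the last few lines of output,
    short enough to post to a mailing list or chat channel.
    """
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    ok = proc.returncode == 0
    tail = "\n".join(proc.stdout.splitlines()[-tail_lines:])
    status = "SUCCESS" if ok else "FAILURE (exit %d)" % proc.returncode
    return ok, "aarch64 build: %s\n%s" % (status, tail)


# Hypothetical usage from the root of a Kudu checkout:
#   ok, summary = run_and_report(["build-support/jenkins/build-and-test.sh"])
#   print(summary)
```

Where the summary is sent (email, Gerrit comment, chat) is left open here, since that is exactly the question that still needs an answer.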

Let me know what you think. I'm curious whether other Kudu developers more 
familiar with our infra and dist-test have any thoughts (cc [~tlipcon]).

> ARM/aarch64 platform support
> ----------------------------
>
>                 Key: KUDU-3007
>                 URL: https://issues.apache.org/jira/browse/KUDU-3007
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: liusheng
>            Priority: Critical
>
> As an important alternative to the x86 architecture, the Aarch64 (ARM) 
> architecture is currently the dominant architecture in small devices such as 
> phones, IoT devices, security cameras, and drones. In addition, more and more 
> hardware and cloud vendors are starting to provide ARM resources, such as AWS, 
> Huawei, Packet, and Ampere. ARM servers are usually cheaper than x86 servers, 
> and more and more of them now offer performance comparable to x86 servers and 
> are even more efficient in some areas.
> We would like to propose adding an Aarch64 CI for Kudu to promote support for 
> Kudu on Aarch64 platforms. We are willing to provide machines to the current 
> CI system and manpower for managing the CI and fixing problems that occur.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
