RE: Tries on migrating Spark Linux arm64 Job from Jenkins to GitHub Actions

2022-01-09 Thread Mark Jens
Hi,

I'd like to give my +1 for more testing on Linux ARM64!
Too bad it is not well supported by GitHub Actions, but the following looks
promising!

Kind regards,
Mark

On 2022/01/09 03:33:48 Yikun Jiang wrote:
> Hi, all
>
> I tried to verify the feasibility of running the *Linux arm64 scheduled job*
> on a self-hosted GitHub Actions runner. Below is my progress so far; I would
> like to hear your suggestions on the next step (continue or stop).
>
> Related JIRA: SPARK-35607
> 
>
> *## About self-hosted GitHub Actions runners:*
> Currently, self-hosted runners support x64 (Linux, macOS, Windows),
> ARM64 (Linux only), and ARM32 (Linux only):
> https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#architectures
>
> Apache Infra provides guidance on self-hosted runners:
> https://cwiki.apache.org/confluence/display/INFRA/GitHub+-+self-hosted+runners
> The obstacle to enabling a self-hosted runner on an Apache repo is resource
> security: the runner must not be accessible to PRs from users who are not
> allowed to use it. According to information and suggestions from the ASF,
> the apache/airflow team maintains a custom runner, which apache/airflow
> also uses in its CI. So we could just use it directly.
>
> TL;DR: we need to set up resources with the custom runner, then enable
> these resources as self-hosted runners.
>
> *## Test on self-hosted GitHub Actions with the custom runner:*
> Here are some trials on my own repo:
> 1. Spark Maven/SBT test:
> PR: https://github.com/apache/spark/pull/35088
> TEST: https://github.com/Yikun/spark/pull/51
> 2. PySpark test:
> PR: https://github.com/apache/spark/pull/35049
> TEST: https://github.com/Yikun/spark/pull/53
> 3. Pull request test from a user who is not allowed:
> TEST: https://github.com/Yikun/spark/pull/60
> The self-hosted runner prevents the PR from accessing the runner and
> reports "Running job on worker spark-github-runner-0001 disallowed by
> security policy".
>
> *## Pros of self-hosted GitHub Actions:*
> - Satisfies the simple demands of the Linux arm64 scheduled jobs.
> - Reuses the main GitHub Actions workflow.
> - All changes are visible on GitHub and easy to review.
> - Easy to migrate once official GA arm64 support is ready.
>
> *## What's the next step:*
> * If we can consider self-hosted runners as an option, I will file a JIRA
> with Apache Infra to request the token and continue, like:
> https://issues.apache.org/jira/browse/INFRA-21305
> * If we firmly think that self-hosted runners are not a wise choice, I
> will try to find another way.
>
> There is also some initial discussion, just FYI:
> https://github.com/dongjoon-hyun/ApacheSparkGitHubActionImage/pull/6
>
> Regards,
> Yikun
>
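
The allow-list behaviour described in the quoted message (a self-hosted runner refusing a job from a PR opened by a user who is not allowed) can be pictured with a minimal Python sketch. This is not the apache/airflow custom runner's implementation: the `ALLOWED_USERS` set, the `JobRequest` type, the `check_job` function, and the choice to trust `schedule` events are all assumptions made for illustration. Only the rejection message format is taken from the thread.

```python
from dataclasses import dataclass

# Hypothetical allow list; a real runner would load this from its own configuration.
ALLOWED_USERS = frozenset({"committer-a", "committer-b"})


@dataclass(frozen=True)
class JobRequest:
    event: str   # triggering event, e.g. "schedule" or "pull_request"
    actor: str   # GitHub login that triggered the job
    worker: str  # name of the self-hosted runner


def check_job(request, allowed_users=ALLOWED_USERS):
    """Return (allowed, message) for a job request under the sketch policy."""
    # Assumption: scheduled jobs run code already merged into the repository,
    # so this sketch treats them as trusted regardless of the actor.
    if request.event == "schedule":
        return True, f"Running job on worker {request.worker}"
    # Any other event is accepted only when the actor is on the allow list.
    if request.actor in allowed_users:
        return True, f"Running job on worker {request.worker}"
    return False, (
        f"Running job on worker {request.worker} disallowed by security policy"
    )


if __name__ == "__main__":
    pr = JobRequest("pull_request", "unknown-user", "spark-github-runner-0001")
    print(check_job(pr)[1])
```

The point of the design is that the decision is made on the runner side, before any job step executes, so an untrusted PR never gets to run code on the arm64 machine.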


Re: Tries on migrating Spark Linux arm64 Job from Jenkins to GitHub Actions

2022-01-09 Thread Dongjoon Hyun
You can volunteer to be in charge of that for our new infra because you are
a PMC member.

BTW, personally, I would prefer to raise funds officially rather than
connect to some unknown servers.

I'll leave the security issues to you, Holden.

Dongjoon

On Sat, Jan 8, 2022 at 8:15 PM Holden Karau  wrote:

> Personally I’d love to see us compiling and testing on Linux arm64 as well.
>
> On Sat, Jan 8, 2022 at 7:49 PM Yikun Jiang  wrote:
>
>> BTW, this is not intended to oppose Apache Spark Infra 2022, which
>> Dongjoon mentioned in "Apache Spark Jenkins Infra 2022". It just shares a
>> possible approach for the Linux arm64 scheduled job.
>>
>> Also, I think we should reach a final conclusion on the Spark community's
>> attitude toward self-hosted runners, for future reference.
>>
>> Regards,
>> Yikun
>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>


Re: Tries on migrating Spark Linux arm64 Job from Jenkins to GitHub Actions

2022-01-08 Thread Holden Karau
Personally I’d love to see us compiling and testing on Linux arm64 as well.

On Sat, Jan 8, 2022 at 7:49 PM Yikun Jiang  wrote:

> BTW, this is not intended to oppose Apache Spark Infra 2022, which
> Dongjoon mentioned in "Apache Spark Jenkins Infra 2022". It just shares a
> possible approach for the Linux arm64 scheduled job.
>
> Also, I think we should reach a final conclusion on the Spark community's
> attitude toward self-hosted runners, for future reference.
>
> Regards,
> Yikun
> --
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: Tries on migrating Spark Linux arm64 Job from Jenkins to GitHub Actions

2022-01-08 Thread Yikun Jiang
BTW, this is not intended to oppose Apache Spark Infra 2022, which
Dongjoon mentioned in "Apache Spark Jenkins Infra 2022". It just shares a
possible approach for the Linux arm64 scheduled job.

Also, I think we should reach a final conclusion on the Spark community's
attitude toward self-hosted runners, for future reference.

Regards,
Yikun
