Hi 力召,

I recently deployed an RBE server on AWS VMs, and it works now. However, I 
have two questions about the Remote Cache.

I first built the modules under "system/core/", then removed the 
"out/soong/.intermediates/" and "out/target/" directories and the RBE logs, 
and built the modules under "system/core/" again. This time, I thought there 
would be no need to execute the build actions and that the cached action 
results would simply be downloaded from the RBE server, but that does not 
seem to be the case. The final log is below; there are still remote executions.

  *RBE Stats: down 1.08 GB, up 0 B, 9000 remote executions*

How can I check whether the Remote Cache works? Is there any log of the 
GetActionResult call, or any setting to enable such a log?
Can I use only the Remote Cache (without Remote Execution) when building 
Android, with some settings on the client end?
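
One rough way to compare the first and second builds is to parse the summary line itself. The following is a minimal Python sketch, not part of reclient: the function name `parse_rbe_stats` is hypothetical, the field layout is assumed from the single "RBE Stats" line quoted above, and the decimal unit multipliers (1 GB = 10^9 bytes) are also an assumption.

```python
import re

# Assumed layout, inferred only from the one line quoted in this thread:
#   RBE Stats: down <size> <unit>, up <size> <unit>, <count> <label>, ...
# Decimal multipliers are an assumption; the tool may use binary units.
_UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_rbe_stats(line):
    """Return bytes transferred down/up and a counter per labelled field."""
    # Drop surrounding whitespace and the mail client's bold markers (*).
    text = line.strip().strip("*")
    prefix = "RBE Stats:"
    if not text.startswith(prefix):
        raise ValueError("not an RBE Stats line")

    stats = {"counts": {}}
    for field in text[len(prefix):].split(","):
        field = field.strip()
        # Transfer fields, e.g. "down 1.08 GB" or "up 0 B".
        m = re.fullmatch(r"(down|up) ([\d.]+) ([KMGT]?B)", field)
        if m:
            stats[m.group(1)] = float(m.group(2)) * _UNITS[m.group(3)]
            continue
        # Counter fields, e.g. "9000 remote executions".
        m = re.fullmatch(r"(\d+) (.+)", field)
        if m:
            stats["counts"][m.group(2)] = int(m.group(1))
    return stats
```

Running this on the line above yields 9000 under the label `remote executions`. Comparing the counters from both builds shows whether the second run still executed actions remotely. Which other labels may appear is not shown in this thread, so the sketch simply keeps whatever labels it finds.
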
On Thursday, April 13, 2023 at 10:44:10 PM UTC+8 李力召 wrote:

>
> *    reclient[1a42b3d9-0a8c-4c83-860f-7fa398daf641]: 
> RemoteErrorResultStatus: failed to upload 
> /home/faqiang/android13/prebuilts/clang/host/linux-x86/clang-r450784d/bin/ld64.lld:
>  
> retry budget exhausted (6 attempts): context deadline exceeded*
>
>
> *I can see that the CAS storage size keeps rising while dependencies are 
> being transferred from the client end to the server end, and then the 
> preceding logs occur.*
>
> *What does this usually mean? Does it mean the dependencies fail to be 
> sent to the server end in time (i.e. a network speed issue)? Can I change the 
> 6 attempts to a higher number to relieve this, and how can I do that?*
>
> The CAS needs high-performance storage. Maybe ld64.lld is too large for 
> the HTTP upload, or the upload timed out, because reproxy uploads files 
> over a very large number of concurrent connections. 
>
>
> *The code I modified under "build/make" is shown below. I don't 
> understand the purpose of the "container-image=docker" platform 
> property. What is it used for?*
>
> This is proto-specific. It means "I want this command to be executed in a 
> Docker container". If you don't care about it, you can ignore it.
>
>
> On Friday, April 7, 2023 at 9:45:19 PM UTC+8 Faqiang Zhu wrote:
>
>> Hi 力召,
>>
>> After switching to high-performance machines, changing some configs on the 
>> service end, and modifying the code under "build/make", I can now build a 
>> test module with a cpp file on the remote server.
>>
>> The code I modified under "build/make" is shown below. I don't understand 
>> the purpose of the "container-image=docker" platform property. What is it 
>> used for?
>>
>>     diff --git a/core/rbe.mk b/core/rbe.mk
>>     index fd3427abf4..2baff7302c 100644
>>     --- a/core/rbe.mk
>>     +++ b/core/rbe.mk
>>     @@ -64,7 +64,7 @@ ifneq ($(filter-out false,$(USE_RBE)),)
>>          d8_exec_strategy := remote_local_fallback
>>        endif
>>
>>     -  platform := container-image=docker://gcr.io/androidbuild-re-dockerimage/android-build-remoteexec-image@sha256:582efb38f0c229ea39952fff9e132ccbe183e14869b39888010dacf56b360d62
>>     +  platform :=
>>        cxx_platform := $(platform),Pool=$(cxx_pool)
>>        java_r8_d8_platform := $(platform),Pool=$(java_pool)
>>
>>
>>
>> It seems that two platform properties are set in the requests:
>> 1. Pool=default
>> 2. container-image=docker://gcr.io/androidbuild-re-dockerimage/android-build-remoteexec-image@sha256:582efb38f0c229ea39952fff9e132ccbe183e14869b39888010dacf56b360d62
>>
>> When starting the worker on the server end, I can use "--platform 
>> Pool=default", but with "--platform Pool=default --platform 
>> container-image=docker://gcr.io/androidbuild-re-dockerimage/android-build-remoteexec-image@sha256:582efb38f0c229ea39952fff9e132ccbe183e14869b39888010dacf56b360d62", 
>> something goes wrong. I also don't know the purpose of setting this 
>> property on my worker.
>>
>>
>> Best Regards,
>> Zhu Faqiang.
>> On Friday, April 7, 2023 at 12:52:04 PM UTC+8 Faqiang Zhu wrote:
>>
>>> Hi 力召,
>>>
>>> It may really be related to the network speed, but I'm not sure about it.
>>>
>>> I switched to two high-performance machines, one working as the client 
>>> end and one as the server end. This time, the CAS storage size grows 
>>> faster. It still fails in the end, but this time the failure log is from 
>>> the server end.
>>>
>>> Best Regards,
>>> Zhu Faqiang.
>>> On Thursday, April 6, 2023 at 10:26:16 PM UTC+8 Faqiang Zhu wrote:
>>>
>>>> Oh, thank you, 力召.
>>>>
>>>> Last time I didn't wait long enough when the log blocked. There were 
>>>> also some issues with the service end, which misled me.
>>>>
>>>> Now there are failure logs like the one below:
>>>>
>>>>
>>>>     reclient[1a42b3d9-0a8c-4c83-860f-7fa398daf641]: 
>>>> RemoteErrorResultStatus: failed to upload 
>>>> /home/faqiang/android13/prebuilts/clang/host/linux-x86/clang-r450784d/bin/ld64.lld:
>>>>  
>>>> retry budget exhausted (6 attempts): context deadline exceeded
>>>>
>>>>
>>>> I can see that the CAS storage size keeps rising while dependencies are 
>>>> being transferred from the client end to the server end, and then the 
>>>> preceding logs occur.
>>>>
>>>> What does this usually mean? Does it mean the dependencies fail to be 
>>>> sent to the server end in time (i.e. a network speed issue)? Can I change 
>>>> the 6 attempts to a higher number to relieve this, and how can I do that?
>>>>
>>>> Best Regards,
>>>> Zhu Faqiang.
>>>> On Tuesday, April 4, 2023 at 10:23:27 PM UTC+8 李力召 wrote:
>>>>
>>>>> Sorry for the late reply.
>>>>>
>>>>> *    The client distributes actions to the service, the service 
>>>>> schedules the actions to the workers, and the workers execute the actions.*
>>>>> *    The Android build system restricts the use of host-installed 
>>>>> tools, and many tools under the "prebuilts/" directory, like clang++, are 
>>>>> used. How can a worker get the same environment as the local build?*
>>>>>
>>>>> No, the reproxy in AOSP sends all the dependencies to the CAS as 
>>>>> normal inputs, so the worker can be very lightweight.
>>>>>
>>>>> *    I set "RBE_CXX_EXEC_STRATEGY" to "remote" and then tried to build a 
>>>>> test module with RBE. The log shows that it blocks on "clang++ 
>>>>> test_source_file.cpp", while on the service end I can see that there 
>>>>> are input requests, but the worker seems to do nothing.*
>>>>> *    I guess it's related to the clang++ tool. I installed clang++ on 
>>>>> the worker machine, but Android should use its own.*
>>>>>
>>>>> Maybe the worker does not completely implement the remote API 
>>>>> <https://github.com/bazelbuild/remote-apis>.
>>>>>
>>>>> On Saturday, March 4, 2023 at 3:08:04 AM UTC+8 Faqiang Zhu wrote:
>>>>>
>>>>>> Hi  力召,
>>>>>>
>>>>>> Thank you, the reproxy can be started now.
>>>>>>
>>>>>> I have another issue, though:
>>>>>>     The client distributes actions to the service, the service 
>>>>>> schedules the actions to the workers, and the workers execute the actions.
>>>>>>     The Android build system restricts the use of host-installed 
>>>>>> tools, and many tools under the "prebuilts/" directory, like clang++, are 
>>>>>> used. How can a worker get the same environment as the local build?
>>>>>>
>>>>>>     I set "RBE_CXX_EXEC_STRATEGY" to "remote" and then tried to build a 
>>>>>> test module with RBE. The log shows that it blocks on "clang++ 
>>>>>> test_source_file.cpp", while on the service end I can see that there 
>>>>>> are input requests, but the worker seems to do nothing.
>>>>>>     I guess it's related to the clang++ tool. I installed clang++ on 
>>>>>> the worker machine, but Android should use its own.
>>>>>>
>>>>>> Best Regards,
>>>>>> Zhu Faqiang.
>>>>>> On Thursday, February 23, 2023 at 2:28:59 AM UTC+8 李力召 wrote:
>>>>>>
>>>>>>> Hi Faqiang,
>>>>>>>
>>>>>>> Reproxy calls the RBE service over HTTPS with credentials. You can 
>>>>>>> disable this with the environment variable "export RBE_service_no_security=true".
>>>>>>>
>>>>>>> On Wednesday, February 22, 2023 at 12:45:06 PM UTC+8 Faqiang Zhu 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm trying to build Android 13 with RBE.
>>>>>>>>
>>>>>>>> As suggested in this post: Build AOSP 11 with Google RBE service 
>>>>>>>> <https://groups.google.com/g/android-building/c/jOd1Z7C6xxk/m/v1os5xbKFgAJ>,
>>>>>>>>  
>>>>>>>> I am trying an alternative option of BuildGrid listed here - 
>>>>>>>> https://bazel.build/community/remote-execution-services.
>>>>>>>>
>>>>>>>> I set up the BuildGrid server based on the documentation. With bazel as 
>>>>>>>> the client building the C++ tutorial examples, the build actions can be 
>>>>>>>> distributed from a machine to the BuildGrid server. Then I tried to build 
>>>>>>>> Android 13 with RBE and this BuildGrid server with the steps below:
>>>>>>>>
>>>>>>>>    - modify the file "build/soong/docs/rbe.json" as below:
>>>>>>>>
>>>>>>>>     diff --git a/docs/rbe.json b/docs/rbe.json
>>>>>>>>     index f6ff10772..3f4c4ccf3 100644
>>>>>>>>     --- a/docs/rbe.json
>>>>>>>>     +++ b/docs/rbe.json
>>>>>>>>     @@ -10,8 +10,8 @@
>>>>>>>>              "RBE_R8": "1",
>>>>>>>>              "RBE_D8": "1",
>>>>>>>>
>>>>>>>>     -        "RBE_instance": "[replace with your RBE instance]",
>>>>>>>>     -        "RBE_service": "[replace with your RBE service endpoint]",
>>>>>>>>     +        "RBE_instance": "main",
>>>>>>>>     +        "RBE_service": "grpc://10.193.102.33:50051",
>>>>>>>>
>>>>>>>>              "RBE_DIR": "prebuilts/remoteexecution-client/live",
>>>>>>>>
>>>>>>>>
>>>>>>>>    - create the credential file 
>>>>>>>>    "$HOME/.config/gcloud/application_default_credentials.json" with 
>>>>>>>>    the command below:
>>>>>>>>
>>>>>>>>     gcloud auth application-default login --no-launch-browser --disable-quota-project
>>>>>>>>
>>>>>>>>
>>>>>>>>    - try to start the build with the command below:
>>>>>>>>
>>>>>>>>     ANDROID_BUILD_ENVIRONMENT_CONFIG=rbe ANDROID_BUILD_ENVIRONMENT_CONFIG_DIR=build/soong/docs make
>>>>>>>>
>>>>>>>>
>>>>>>>> But I got the failure below, and it seems no related source code can be 
>>>>>>>> found:
>>>>>>>>
>>>>>>>>     18:58:52 Unable to start RBE reproxy
>>>>>>>>     FAILED: RBE bootstrap failed with: exit status 10
>>>>>>>>     E0221 18:58:52.597734 1344945 bootstrap.go:96] Unable to start reproxy: "E0221 18:58:50.166111 1344959 main.go:205] Failed to initialize remote-execution client: rpc error: code = Unavailable desc = rpc error: code = Unavailable desc = retry budget exhausted (6 attempts): all SubConns are in TransientFailure, authentication type (identity) used=\"application default credentials\"\n"
>>>>>>>>
>>>>>>>>     Try restarting the build after running the following command:
>>>>>>>>         gcloud auth application-default login --no-launch-browser --disable-quota-project
>>>>>>>>
>>>>>>>>
>>>>>>>> Has anyone tried the alternative RE service options listed in 
>>>>>>>> https://bazel.build/community/remote-execution-services? 
>>>>>>>> Which RE service did you choose? 
>>>>>>>> Did you encounter a similar or the same issue as me? 
>>>>>>>> Are there any fixes for the issue I encountered?
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Zhu Faqiang.
>>>>>>>>
>>>>>>>

-- 
-- 
You received this message because you are subscribed to the "Android Building" 
mailing list.
To post to this group, send email to android-building@googlegroups.com
To unsubscribe from this group, send email to
android-building+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/android-building?hl=en
