Re: [ccache] Combining multiple ccache into one

2018-03-19 Thread Jason Zhou via ccache
Thank you Anders for your response! More questions inline:

> On Mar 18, 2018, at 5:32 AM, Anders Björklund via ccache wrote:
> 
> Jason Zhou wrote:
>> I am looking for an efficient way to correctly combine multiple
>> ccache directories from hundreds of build machines into a single
>> superset ccache. We use 200+ autoscaled cloud machines in our
>> build farm, and each machine builds a random subset of the source
>> tree. The ccache on each machine is ~70GB and contains ~500K files.
>> Having a superset ccache pre-built in the cloud image would greatly
>> improve our build time.
> 
> Including a pre-populated cache in the OS image is a novel idea,
> but I wonder whether you really have to resort to that "workaround".
> 
> You could keep a local cache, and sync it from a "secondary cache".
> We have some code for this, but none of it is up to date with master.

We are doing this because we need to launch new cloud instances fast so that 
we can respond quickly to build requests. Syncing from a secondary cache would 
take much longer.

> 
>> I noticed that the same ccache filename (*.o, *.manifest, *.d) does not
>> necessarily have the same content (md5sum) on different machines, and
>> I wonder whether rsync is the right tool for this, or whether it is
>> feasible at all to combine ccache directories.
> 
> This is normal. The created files might have different timestamps
> and such, which makes their checksums differ. But they are supposed
> to be interchangeable, so none of those differences should *matter*
> (if they do, then we are failing to hash something important...)

So ccache files are interchangeable as long as they have the same filename 
(hash name), and this applies to all ccache files: *.o, *.manifest, *.d? If so, 
we can use "rsync -a --ignore-existing" to combine multiple ccache directories 
much faster (no need to check file timestamps and sizes).
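
A minimal sketch of the merge described above, assuming each machine's cache has already been copied to a local directory. The function name `merge_ccaches` and all paths are hypothetical. It relies on the premise that files with the same hash name are interchangeable, so the first copy seen wins:

```shell
#!/bin/sh
# merge_ccaches DEST SRC...
# Copy each SRC cache directory into DEST without overwriting any file
# that DEST already contains (first copy wins). Hypothetical helper.
merge_ccaches() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for src in "$@"; do
        # The trailing slash on the source copies its contents rather than
        # the directory itself. Temporary files (*.tmp.*) are skipped.
        rsync -a --ignore-existing --exclude='**.tmp.*' "$src"/ "$dest"/ || return 1
    done
}
```

For example, `merge_ccaches /srv/superset /srv/caches/machine-001 /srv/caches/machine-002` (paths are placeholders). The merged directory may exceed the configured cache size limit, so the limit on the image probably needs to be raised to match.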

> 
>> I am trying to avoid putting ccache on an NFS mount because of the number
>> of machines we are dealing with; NFS performance is not promising.
> 
> Have you tried out the memcached version ? It was developed for
> that reason... You can have a cluster of such servers, if needed.
> 
> https://github.com/ccache/ccache/tree/dev/memcached

I'd love to explore the memcached version. The link above doesn't mention any 
memcached info; do you have a link that describes how to use memcached with 
ccache?

Thanks again for your help!

Jason

> 
> To further scale out, one can keep a local memcached proxy ("moxi")
> and have the cluster be disk-backed (using couchbase) for restarts.
> 
> https://www.couchbase.com/memcached
> 
> /Anders


___
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache


Re: [ccache] Combining multiple ccache into one

2018-03-18 Thread Anders Björklund via ccache
On 2018-03-14 at 19:31, Basile Starynkevitch via ccache wrote:
> 
> 
> On 03/14/2018 06:54 PM, Jason Zhou via ccache wrote:
>> Hi,
>>
>> I am looking for an efficient way to correctly combine multiple ccache 
>> directories from hundreds of build machines into a single superset ccache.
> 
> perhaps you should consider distcc https://github.com/distcc/distcc or 
> icecream https://github.com/icecc/icecream
> 

The use of distcc is orthogonal to the use of ccache;
it only applies in the case of a cache miss (compile).
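
As a rough illustration of how the two combine: ccache's CCACHE_PREFIX setting names a command that ccache puts in front of the real compiler when it has to compile, so distcc is only involved on a miss. The wrapper name `ccache_distcc` is hypothetical, and the sketch assumes ccache and distcc are installed and distcc hosts are configured:

```shell
#!/bin/sh
# ccache_distcc COMPILER ARGS...
# Run the compiler through ccache. On a cache hit ccache answers from the
# cache; on a miss it runs "distcc COMPILER ARGS...". Hypothetical wrapper.
ccache_distcc() {
    CCACHE_PREFIX=distcc ccache "$@"
}
```

Usage would look like `ccache_distcc gcc -c hello.c -o hello.o`, where hello.c is a placeholder source file.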

/Anders


[ccache] Combining multiple ccache into one

2018-03-14 Thread Jason Zhou via ccache
Hi,

I am looking for an efficient way to correctly combine multiple ccache 
directories from hundreds of build machines into a single superset ccache. We 
use 200+ autoscaled cloud machines in our build farm, and each machine builds a 
random subset of the source tree. The ccache on each machine is ~70GB and 
contains ~500K files. Having a superset ccache pre-built in the cloud image 
would greatly improve our build time.

I have tried the following rsync commands to combine the caches and would like 
to get your expert opinions and suggestions on the correctness and efficiency 
of the process:

rsync command 1:

"rsync -a --exclude=**.tmp.*" is very slow and takes hours to combine ccache.

rsync command 2:

"rsync -a --exclude=**.tmp.* --ignore-existing" is very fast.

I noticed that the same ccache filename (*.o, *.manifest, *.d) does not 
necessarily have the same content (md5sum) on different machines, and I wonder 
whether rsync is the right tool for this, or whether it is feasible at all to 
combine ccache directories.
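
One way to reproduce this observation is sketched below, assuming two machines' cache directories have been copied side by side. The function name `list_differing` is hypothetical. It prints every relative path that exists in both directories but with different bytes:

```shell
#!/bin/sh
# list_differing DIR_A DIR_B
# Print the relative path of each regular file that is present in both
# directories but whose content differs. Hypothetical helper.
list_differing() {
    a=$1
    b=$2
    # List files relative to DIR_A, then compare each with its twin in DIR_B.
    (cd "$a" && find . -type f) | while IFS= read -r rel; do
        if [ -f "$b/$rel" ] && ! cmp -s "$a/$rel" "$b/$rel"; then
            printf '%s\n' "${rel#./}"
        fi
    done
}
```

For example, `list_differing /srv/caches/machine-001 /srv/caches/machine-002` (placeholder paths) lists the entries whose checksums would disagree between the two machines.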

Thanks in advance!

Jason


P.S.

I am trying to avoid putting ccache on an NFS mount because of the number of 
machines we are dealing with; NFS performance is not promising.