Re: [Scons-dev] Looking for help mapping Windows pdb semantics to SCons

2019-05-25 Thread Mats Wichmann
On 5/24/19 5:50 PM, Bill Deegan wrote:
> Mats,
> 
> What builders can't you cache?
> 
> -Bill
> 
> On Fri, May 24, 2019 at 3:56 PM Mats Wichmann wrote:
> 
> For all you guys, is the current caching - all filesystem based -
> useful enough? I've been chewing on a network based extension, for
> all those disposable builders that don't really have great ways to cache

I have a specific scenario I'm thinking about, but I'll put it in more
general terms:

if you have a CI system where each commit push to the review system
triggers a cascade of builds, and the system images to do the builds are
spun up for the build and then thrown away when done, then an
in-filesystem cache doesn't do them much good, right? Because there's no
persistent storage at least inside the image.  Some systems may have a
way to cache some things to avoid re-downloading, but I'm not sure that
extends to artifacts created by the build, as the scons cache and
sconsign database would be.

In the case I've worked on, the system is essentially homegrown and I
know the "download cache" is a hack and wouldn't extend to this usage.

See here for an example of another project addressing this:

https://docs.bazel.build/versions/master/remote-caching.html


But I asked the question (half-hijacking the thread, sorry) because I'm
not sure if my view of the scons use cases matches anyone else's.
___
Scons-dev mailing list
Scons-dev@scons.org
https://pairlist2.pair.net/mailman/listinfo/scons-dev


Re: [Scons-dev] Looking for help mapping Windows pdb semantics to SCons

2019-05-25 Thread Adam Gross via Scons-dev
>> For all you guys, is the current caching - all filesystem based - useful 
>> enough? I've been chewing on a network
>> based extension, for all those disposable builders that don't really have 
>> great ways to cache

I am indeed finding that the built-in SCons caching isn’t conducive to network 
caching. I was preparing a separate e-mail about it but I’ll just include it 
here. Let me know if you want me to start a new thread for the discussion.

The basic summary is that the current cache implementation asks for the file 
when it needs it and doesn’t have any bulk frontload capability, so if I have 
5000 targets, I would have to do 5000 roundtrips to the server. That isn’t 
going to work for network caching, especially given that we want to integrate 
with virtual filesystems so we only hydrate the targets that are actually used 
by developers.

--- Background ---

One of the things we are working on at VMware is implementing remote caching 
using SCons. We are hoping to upstream as many of our changes as possible, so I 
am hoping to get ideas on how to do this right. I hope to send out a summary of 
the plans we have soon (dependency enforcement, remote caching, and 
platform-specific virtual filesystem integration), but for now I need help with 
one specific problem in our prototype: bulk cache frontloading.

--- Background on existing caching mechanism ---

Currently the SCons caching mechanism (as implemented by the Taskmaster and 
CacheDir classes) does just-in-time caching where SCons first asks a CacheDir 
object whether it has a target file in the cache. If it does, it proceeds onto 
the next target file in the targets list for that action (if any). If it 
doesn’t, it skips asking the cache for the rest of the target files in the 
targets list of that action and just runs the action. If all targets are 
retrieved from the cache, the action is not run.
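
A minimal sketch of the lookup order described above, not the actual Taskmaster or CacheDir code. The names try_cache and build, and the dict standing in for the cache, are assumptions made for illustration only.

```python
def try_cache(targets, cache):
    """Ask the cache for each (name, signature) target in order.

    Returns (all_hit, lookups): all_hit is True only if every target was
    found, and lookups counts how many cache queries were made.
    `cache` is a hypothetical stand-in: any container of known signatures.
    """
    lookups = 0
    for _name, signature in targets:
        lookups += 1
        if signature not in cache:
            # First miss: stop asking about the remaining targets.
            return False, lookups
    return True, lookups


def build(targets, cache, run_action):
    """Run the action only if at least one target was missing from the cache."""
    all_hit, lookups = try_cache(targets, cache)
    if not all_hit:
        run_action()
    return all_hit, lookups
```

Each lookup here is a separate query, which is the property that matters once the cache sits behind a network connection.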

--- Downsides of existing caching mechanism ---

This mechanism wouldn’t work for remote caching because it is 
latency-sensitive. If I am building 5000 files, populating the cache would 
require up to 5000 roundtrips to the cache server. If I have 100ms latency to 
the cache server, that is an overhead of 500 seconds.
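
The arithmetic behind that figure, as a small sketch. The function name is made up for illustration, and the model counts only round-trip latency, ignoring transfer time and server processing.

```python
def roundtrip_overhead_seconds(num_requests, latency_ms):
    """Total time spent waiting on round trips, in seconds."""
    return num_requests * latency_ms / 1000.0


# One request per target, as with the just-in-time mechanism.
per_target = roundtrip_overhead_seconds(5000, 100)

# A single bulk request covering all targets (latency only).
bulk = roundtrip_overhead_seconds(1, 100)

print(per_target, bulk)
```

With 5000 requests at 100 ms each, the per-target figure is 500 seconds, while a single batched request pays the latency once.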

--- What I’d like to do ---

I’d like to implement a --cache-frontload parameter that does two runs through 
the node graph:


  1.  An initial dry run where we generate the content signatures of all target
      files.
      *   This run culminates with a call to the CacheDir object, e.g.
          retrieve_all(allNodes), where “allNodes” contains (in the example
          from the previous section) 5000 entries, each of which has the
          result of get_cachedir_csig and the full file path.
  2.  A second run where we run any actions that were not fulfilled from cache.
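
A rough sketch of the two-pass idea, under heavy assumptions: FakeRemoteCache is an in-memory stand-in for a cache server, nodes are plain dicts rather than SCons Node objects, and retrieve_all is the proposed (not existing) bulk call. It only shows that the batch costs one round trip regardless of the number of targets.

```python
class FakeRemoteCache:
    """Hypothetical in-memory stand-in for a remote cache server."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self.roundtrips = 0

    def retrieve_all(self, requests):
        """One simulated round trip for a whole batch of (csig, path) pairs.

        Returns the set of paths the cache was able to supply.
        """
        self.roundtrips += 1
        return {path for csig, path in requests if csig in self._entries}


def frontload_build(nodes, cache, run_action):
    """Pass 1 asks the cache for everything; pass 2 builds what is missing."""
    # Pass 1 (dry run): collect the signature and path of every target.
    requests = [(node["csig"], node["path"]) for node in nodes]
    retrieved = cache.retrieve_all(requests)

    # Pass 2: run the action for each target the cache could not supply.
    rebuilt = []
    for node in nodes:
        if node["path"] not in retrieved:
            run_action(node)
            rebuilt.append(node["path"])
    return retrieved, rebuilt
```

The sketch sidesteps the hard part raised in this message: in SCons the dry run changes node state, so the graph has to be reset before the second pass.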

I tried implementing this but ran into problems resetting the node graph 
between steps #1 and #2. Anything not retrieved from the cache needs to be 
reset to “pending” or “no state”, but ideally the cached children should be 
retained so we don’t need to scan the files again. The problems I am running 
into with resetting the node graph include:


  1.  Easily and quickly accessing all nodes that were iterated over during the 
dry run.
  2.  Resetting the state of all nodes not retrieved from cache.
  3.  Reversing seemingly destructive “end of lifecycle” actions from objects.

The most I could try to do was to remember all Node objects (including build 
targets and containing directories) and then do the following on each object:


  1.  node.set_state(SCons.Node.no_state)
  2.  node.clear()
  3.  node.clear_memoized_values()
  4.  node.executor_cleanup()

But it seems like a hack and I haven’t been able to get it to work well.
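
For reference, the attempted reset can be written down as a small helper. This is only a restatement of the four calls listed above, not a fix: RecordingNode is a hypothetical stand-in used so the sketch runs without SCons, and the "no state" value is passed in rather than imported.

```python
class RecordingNode:
    """Hypothetical stand-in that records which reset calls it received."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def set_state(self, state):
        self.calls.append(("set_state", state))

    def clear(self):
        self.calls.append(("clear",))

    def clear_memoized_values(self):
        self.calls.append(("clear_memoized_values",))

    def executor_cleanup(self):
        self.calls.append(("executor_cleanup",))


def reset_unretrieved(visited_nodes, retrieved, no_state):
    """Apply the four reset calls to every node not supplied by the cache."""
    reset = []
    for node in visited_nodes:
        if node in retrieved:
            continue  # Keep cached nodes as they are.
        node.set_state(no_state)
        node.clear()
        node.clear_memoized_values()
        node.executor_cleanup()
        reset.append(node)
    return reset
```

The helper assumes the caller has already remembered every node visited during the dry run, which is the first of the problems listed above.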

Has anyone tried doing something like this before? Any recommendations where to 
start?


From: Mats Wichmann 
Sent: Friday, May 24, 2019 5:51 PM
To: SCons developer list ; Andrew C. Morrow 

Cc: Adrian Oney ; Adam Gross 
Subject: Re: [Scons-dev] Looking for help mapping Windows pdb semantics to SCons

For all you guys, is the current caching - all filesystem based - useful 
enough? I've been chewing on a network based extension, for all those 
disposable builders that don't really have great ways to cache

On May 24, 2019 3:45:01 PM MDT, "Andrew C. Morrow" <andrew.c.mor...@gmail.com> wrote:

Hi Adam -

I'm working in this same area (caching and debug info handling) for the SCons 
based MongoDB build system, right now.

Overall, I am trying to move to a model on Windows that is more like using 
-gsplit-dwarf with the GNU tools, where every object file gets a (cacheable) 
.pdb, and then we link with /DEBUG:fastlink, and defer the final per 
library/executable PDB to a post link step by using mspdbcmf. This is similar 
to using dwp to package up the .dwo files.

You can see some of my very much work-in-progress state here: 
https://github.com/acmorrow/mongo/blob/S

Re: [Scons-dev] Looking for help mapping Windows pdb semantics to SCons

2019-05-24 Thread Bill Deegan
Andrew & Adam,

Unless the discussion is proprietary, perhaps there would be value in
continuing it on list?
That would make it easier for others to follow the development process and
also in the future look back and understand some of the why?

-Bill

On Fri, May 24, 2019 at 3:45 PM Andrew C. Morrow 
wrote:

>
> Hi Adam -
>
> I'm working in this same area (caching and debug info handling) for the
> SCons based MongoDB build system, right now.
>
> Overall, I am trying to move to a model on Windows that is more like using
> -gsplit-dwarf with the GNU tools, where every object file gets a
> (cacheable) .pdb, and then we link with /DEBUG:fastlink, and defer the
> final per library/executable PDB to a post link step by using mspdbcmf.
> This is similar to using dwp to package up the .dwo files.
>
> You can see some of my very much work-in-progress state here:
> https://github.com/acmorrow/mongo/blob/SERVER-33661/site_scons/site_tools/separate_debug.py
>
> Unfortunately, I've encountered one showstopper issue for us:
> https://developercommunity.visualstudio.com/content/problem/573023/absolute-paths-for-associated-pdb-files-are-record.html,
> and I'm waiting to hear back on it.
>
> The next steps in my current approach would be to move the actions that
> produce the finalized .dwp, .dSYM, or .pdb file into separate builders,
> rather than adding them as actions to the .Program and .SharedLibrary
> builders. That would allow the build tasks to finalize the debug
> information to be executed separately, or not at all for developer builds
> where keeping the debug info in separated per-object files is sufficient.
>
> If you are interested, I'd be happy to collaborate (off-list initially?)
> to discuss some of the issues we have encountered and find a way to avoid
> duplication of effort. Improving the debug info handling situation is
> something I'm keenly interested in, as it is a major bottleneck in our
> build performance.
>
> Thanks,
> Andrew
>
>
>
> On Fri, May 24, 2019 at 3:45 PM Tomasz Gajewski  wrote:
>
>>
>> Adam Gross via Scons-dev  writes:
>>
>> > I am investigating better supporting caching with SCons at VMware and
>> > am trying to see if I can teach SCons about pdb files.
>>
>> Is there any problem for your use cases in using /Z7 option for
>> compilation? That tells the compiler to embed debug data in .obj file
>> like on linux. Then during linking pdb's are created. It works at least
>> for shared libraries and executables.
>>
>> Regards
>> Tomasz Gajewski
>>


Re: [Scons-dev] Looking for help mapping Windows pdb semantics to SCons

2019-05-24 Thread Bill Deegan
Mats,

What builders can't you cache?

-Bill

On Fri, May 24, 2019 at 3:56 PM Mats Wichmann  wrote:

> For all you guys, is the current caching - all filesystem based - useful
> enough? I've been chewing on a network based extension, for all those
> disposable builders that don't really have great ways to cache
>
>
> On May 24, 2019 3:45:01 PM MDT, "Andrew C. Morrow" <
> andrew.c.mor...@gmail.com> wrote:
>>
>>
>> Hi Adam -
>>
>> I'm working in this same area (caching and debug info handling) for the
>> SCons based MongoDB build system, right now.
>>
>> Overall, I am trying to move to a model on Windows that is more like
>> using -gsplit-dwarf with the GNU tools, where every object file gets a
>> (cacheable) .pdb, and then we link with /DEBUG:fastlink, and defer the
>> final per library/executable PDB to a post link step by using mspdbcmf.
>> This is similar to using dwp to package up the .dwo files.
>>
>> You can see some of my very much work-in-progress state here:
>> https://github.com/acmorrow/mongo/blob/SERVER-33661/site_scons/site_tools/separate_debug.py
>>
>> Unfortunately, I've encountered one showstopper issue for us:
>> https://developercommunity.visualstudio.com/content/problem/573023/absolute-paths-for-associated-pdb-files-are-record.html,
>> and I'm waiting to hear back on it.
>>
>> The next steps in my current approach would be to move the actions that
>> produce the finalized .dwp, .dSYM, or .pdb file into separate builders,
>> rather than adding them as actions to the .Program and .SharedLibrary
>> builders. That would allow the build tasks to finalize the debug
>> information to be executed separately, or not at all for developer builds
>> where keeping the debug info in separated per-object files is sufficient.
>>
>> If you are interested, I'd be happy to collaborate (off-list initially?)
>> to discuss some of the issues we have encountered and find a way to avoid
>> duplication of effort. Improving the debug info handling situation is
>> something I'm keenly interested in, as it is a major bottleneck in our
>> build performance.
>>
>> Thanks,
>> Andrew
>>
>>
>>
>> On Fri, May 24, 2019 at 3:45 PM Tomasz Gajewski  wrote:
>>
>>>
>>> Adam Gross via Scons-dev  writes:
>>>
>>> > I am investigating better supporting caching with SCons at VMware and
>>> > am trying to see if I can teach SCons about pdb files.
>>>
>>> Is there any problem for your use cases in using /Z7 option for
>>> compilation? That tells the compiler to embed debug data in .obj file
>>> like on linux. Then during linking pdb's are created. It works at least
>>> for shared libraries and executables.
>>>
>>> Regards
>>> Tomasz Gajewski
>>>
>>>
>>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [Scons-dev] Looking for help mapping Windows pdb semantics to SCons

2019-05-24 Thread Mats Wichmann
For all you guys, is the current caching - all filesystem based - useful 
enough? I've been chewing on a network based extension, for all those 
disposable builders that don't really have great ways to cache


On May 24, 2019 3:45:01 PM MDT, "Andrew C. Morrow"  
wrote:
>Hi Adam -
>
>I'm working in this same area (caching and debug info handling) for the
>SCons based MongoDB build system, right now.
>
>Overall, I am trying to move to a model on Windows that is more like
>using
>-gsplit-dwarf with the GNU tools, where every object file gets a
>(cacheable) .pdb, and then we link with /DEBUG:fastlink, and defer the
>final per library/executable PDB to a post link step by using mspdbcmf.
>This is similar to using dwp to package up the .dwo files.
>
>You can see some of my very much work-in-progress state here:
>https://github.com/acmorrow/mongo/blob/SERVER-33661/site_scons/site_tools/separate_debug.py
>
>Unfortunately, I've encountered one showstopper issue for us:
>https://developercommunity.visualstudio.com/content/problem/573023/absolute-paths-for-associated-pdb-files-are-record.html,
>and I'm waiting to hear back on it.
>
>The next steps in my current approach would be to move the actions that
>produce the finalized .dwp, .dSYM, or .pdb file into separate builders,
>rather than adding them as actions to the .Program and .SharedLibrary
>builders. That would allow the build tasks to finalize the debug
>information to be executed separately, or not at all for developer
>builds
>where keeping the debug info in separated per-object files is
>sufficient.
>
>If you are interested, I'd be happy to collaborate (off-list
>initially?) to
>discuss some of the issues we have encountered and find a way to avoid
>duplication of effort. Improving the debug info handling situation is
>something I'm keenly interested in, as it is a major bottleneck in our
>build performance.
>
>Thanks,
>Andrew
>
>
>
>On Fri, May 24, 2019 at 3:45 PM Tomasz Gajewski  wrote:
>
>>
>> Adam Gross via Scons-dev  writes:
>>
>> > I am investigating better supporting caching with SCons at VMware
>and
>> > am trying to see if I can teach SCons about pdb files.
>>
>> Is there any problem for your use cases in using /Z7 option for
>> compilation? That tells the compiler to embed debug data in .obj file
>> like on linux. Then during linking pdb's are created. It works at
>least
>> for shared libraries and executables.
>>
>> Regards
>> Tomasz Gajewski
>>
>>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [Scons-dev] Looking for help mapping Windows pdb semantics to SCons

2019-05-24 Thread Andrew C. Morrow
Hi Adam -

I'm working in this same area (caching and debug info handling) for the
SCons based MongoDB build system, right now.

Overall, I am trying to move to a model on Windows that is more like using
-gsplit-dwarf with the GNU tools, where every object file gets a
(cacheable) .pdb, and then we link with /DEBUG:fastlink, and defer the
final per library/executable PDB to a post link step by using mspdbcmf.
This is similar to using dwp to package up the .dwo files.

You can see some of my very much work-in-progress state here:
https://github.com/acmorrow/mongo/blob/SERVER-33661/site_scons/site_tools/separate_debug.py

Unfortunately, I've encountered one showstopper issue for us:
https://developercommunity.visualstudio.com/content/problem/573023/absolute-paths-for-associated-pdb-files-are-record.html,
and I'm waiting to hear back on it.

The next steps in my current approach would be to move the actions that
produce the finalized .dwp, .dSYM, or .pdb file into separate builders,
rather than adding them as actions to the .Program and .SharedLibrary
builders. That would allow the build tasks to finalize the debug
information to be executed separately, or not at all for developer builds
where keeping the debug info in separated per-object files is sufficient.

If you are interested, I'd be happy to collaborate (off-list initially?) to
discuss some of the issues we have encountered and find a way to avoid
duplication of effort. Improving the debug info handling situation is
something I'm keenly interested in, as it is a major bottleneck in our
build performance.

Thanks,
Andrew



On Fri, May 24, 2019 at 3:45 PM Tomasz Gajewski  wrote:

>
> Adam Gross via Scons-dev  writes:
>
> > I am investigating better supporting caching with SCons at VMware and
> > am trying to see if I can teach SCons about pdb files.
>
> Is there any problem for your use cases in using /Z7 option for
> compilation? That tells the compiler to embed debug data in .obj file
> like on linux. Then during linking pdb's are created. It works at least
> for shared libraries and executables.
>
> Regards
> Tomasz Gajewski
>


Re: [Scons-dev] Looking for help mapping Windows pdb semantics to SCons

2019-05-24 Thread Tomasz Gajewski


Adam Gross via Scons-dev  writes:

> I am investigating better supporting caching with SCons at VMware and
> am trying to see if I can teach SCons about pdb files.

Is there any problem for your use cases with using the /Z7 option for
compilation? That tells the compiler to embed debug data in the .obj file,
as on Linux. The pdbs are then created during linking. It works at least
for shared libraries and executables.

Regards
Tomasz Gajewski



[Scons-dev] Looking for help mapping Windows pdb semantics to SCons

2019-05-24 Thread Adam Gross via Scons-dev
I am investigating better supporting caching with SCons at VMware and am trying 
to see if I can teach SCons about pdb files. Sorry, this e-mail will be a bit 
long because the topic is quite convoluted.

Right now we do a fine job with pdb files for programs and shared libraries, 
because we register a separate node (e.g. "foo-debug") that is referenced by 
the packaging code. However, static libraries and standalone object file 
compilations are not handled well because of how Windows writes per-directory 
pdbs when we provide the /Fd flag (reference for that flag - 
https://docs.microsoft.com/en-us/cpp/build/reference/fd-program-database-file-name?view=vs-2019
 ).

As an example, if we have the following files in a directory:

topdir\subdir\a.cpp
topdir\subdir\b.cpp

Our subdirectory-related build system would set up both a.cpp and b.cpp to 
compile with the parameter /Fdbuild\intdir\topdir\subdir\subdir.pdb. The 
crucial point here is that the compiler will write to subdir.pdb when compiling 
both a.cpp and b.cpp. If the resulting object files are archived into a static 
library, the archiver will not actually touch that .pdb file. That is, writing 
that subdir.pdb file is done by the compiler when building each source object.
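
To make the collision concrete, here is a small sketch of how such a per-directory /Fd path could be derived. The helper names and the default build\intdir prefix are taken from the example above or invented for illustration; this is not the actual build system code.

```python
from pathlib import PureWindowsPath


def shared_pdb_for(source, intdir=r"build\intdir"):
    """Return the per-directory pdb path for a source file.

    The pdb is named after the source file's directory, so every source
    in that directory maps to the same pdb.
    """
    subdir = PureWindowsPath(source).parent
    return PureWindowsPath(intdir) / subdir / (subdir.name + ".pdb")


def fd_flag(source):
    """Build the /Fd compiler flag for a source file."""
    return "/Fd" + str(shared_pdb_for(source))


print(fd_flag(r"topdir\subdir\a.cpp"))
print(fd_flag(r"topdir\subdir\b.cpp"))
```

Both calls produce the same flag, which is why two independent compile actions end up writing to one file.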

It was tempting to then say that subdir.pdb is a SideEffect of each compilation 
action, but we don't want that because (1) it would prevent those files from 
compiling in parallel and (2) it still wouldn't allow us to support retrieving 
that pdb from cache. Microsoft supports compiling these in parallel because 
writes to the pdb go through a separate process mspdbsrv.exe, which deals with 
synchronization of writes to that file.

So in this case, I can't find a way to treat these pdb's. There are two real 
use cases here:


  1.  We eventually link the object files from the subdirectory into a static 
library, which is then linked with other things into the dll/exe.
  2.  We don't link them together and instead link the various obj files into 
the dll/exe.

For #1, I can get away with considering the subdirectory's common pdb file to 
be another target along with the static library. So despite the archiver not 
actually touching subdir.pdb, we would still say that the archive action has a 
target of subdir.lib and subdir.pdb (cheating a bit, but it works).

For #2, I can't find a way to track this. Ideally there would be some sort of 
"meta-target" construct in SCons so that the compilation action for a.cpp and 
b.cpp can list subdir.pdb as another target, but AFAICT that would cause SCons 
to complain that the same target is defined in multiple places.

Any ideas or pointers to help?

Thanks,
Adam Gross