Hi Antoine,

> My question would be: what happens after the PR is merged? Are
> developers supposed to keep the Bazel setup working in addition to
> CMake? Or is there a dedicated maintainer (you? :-)) to fix regressions
> when they happen?

In the short term, I would be willing to be a dedicated maintainer for Mac (and for Linux as well, once I get support working there). I'd like to classify the support as very experimental (not advertised in the documentation yet). If other devs find Bazel useful, I would expect others to help with maintenance naturally. If it gets too much for me to maintain, I'm willing to drop support completely, since it won't be a critical part of the build infrastructure. Once the setup is more complete, I would plan on adding a CI target for it as well.

> Can you give an example of circular dependency? Can this be solved by
> having more "type_fwd.h" headers for forward declarations of opaque types?

I think the type_fwd.h might contribute to the problem. The solution would be more granular header/compilation units when possible (or combining targets appropriately). An example of the problem is expression.h/.cc and operation.h/.cc in the compute library. Because operation.cc depends on expression.h and expression.cc relies on operation.h, there is a cycle between the two targets. I fixed this by making a new header-only target for expression.h, which the operation target depends on. Then the expression target depends on the operation target. An alternative approach would be to combine "expression.*" and "operation.*" into a single target.

> (also, generally, it would be desirable to use more of these, since our
> compile times have become egregious as of late - I'm currently
> considering replacing my 8-core desktop CPU with a beefier one :-/)

I'm not a huge fan of this approach in general, but since I haven't been able to contribute on a day-to-day basis to the C++ code base, I'll let the active contributors decide the best course here.
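
To make the expression/operation fix described above more concrete, here is a rough sketch of how the split could look in a BUILD file. The target names (expression_hdr, operation, expression) are made up for illustration and are not copied from the PR, and the sketch assumes expression.h does not itself include operation.h:

```starlark
# Hypothetical BUILD fragment; names are illustrative, not taken from the PR.

# Header-only target: exposes expression.h without pulling in expression.cc,
# so other targets can include the header without depending on the
# implementation target.
cc_library(
    name = "expression_hdr",
    hdrs = ["expression.h"],
)

# operation.cc includes expression.h, so it depends only on the header target.
cc_library(
    name = "operation",
    srcs = ["operation.cc"],
    hdrs = ["operation.h"],
    deps = [":expression_hdr"],
)

# expression.cc includes operation.h, so the implementation target depends on
# :operation. The graph is now expression -> operation -> expression_hdr,
# with no cycle.
cc_library(
    name = "expression",
    srcs = ["expression.cc"],
    deps = [
        ":expression_hdr",
        ":operation",
    ],
)
```

The alternative of combining the two would be a single cc_library listing both .cc files in srcs and both .h files in hdrs, at the cost of a coarser compilation unit.
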
I thought computer upgrades were something to look forward to ;)

> This sounds really like a bummer. Do you have to spell those out by
> hand? Or is there some tool that infers dependencies and generates the
> declarations for you?

Yes, I had to spell them out by hand. There is an internal tool at Google that helps with it (I didn't use it for this PR). There has been some discussion of open-sourcing the tool [1], but I wouldn't expect it any time soon. Luckily things are fairly well modularized at the moment, so while it was tedious, it was not tremendously painful. Another solution would be to have larger targets (e.g. one per directory) that use globs, which would make it less painful, but this loses some of the benefits mentioned above.

[1] https://github.com/bazelbuild/bazel/issues/6871

On Tue, Nov 26, 2019 at 1:27 AM Antoine Pitrou <[email protected]> wrote:

> Hi Micah,
>
> On 26/11/2019 at 05:52, Micah Kornfield wrote:
> >
> > After going through this exercise I put together a list of pros and cons
> > below.
> >
> > I would like to hear from other devs:
> > 1. Their opinions on setting this up as an alternative system (I'm willing
> > to invest some more time in it).
> > 2. What people think the minimum bar for merging a PR like this should be?
>
> My question would be: what happens after the PR is merged? Are
> developers supposed to keep the Bazel setup working in addition to
> CMake? Or is there a dedicated maintainer (you? :-)) to fix regressions
> when they happen?
>
> > Pros:
> > 1. Being able to run "bazel test python/..." and having compilation of all
> > python dependencies just work is a nice experience.
> > 2. Because of the granular compilation units, it can improve developer
> > velocity. Unit tests can depend only on the sub-components they are meant
> > to test. They don't need to compile and relink arrow.so.
> > 3. 
The built-in documentation it provides about visibility and
> > relationships between components is nice (it's uncovered some "interesting
> > dependencies"). I didn't make heavy use of it, but its concept of
> > "visibility" makes things more explicit about what external consumers
> > should be depending on, and what inter-project components should depend on
> > (e.g. explicitly limit the scope of vendored code).
> > 4. Extensions are essentially python, which might be easier to work with
> > than CMake
>
> Those sound nice.
>
> > Cons:
> > 1. Bazel is opinionated on C++ layout. In particular it requires some
> > workarounds to deal with circular .h/.cc dependencies. The two main ways
> > of doing this are either increasing the size of compilable units [4] to
> > span all dependencies in the cycle, or creating separate
> > header/implementation targets. I've used both strategies in the PR. One
> > could argue that it would be nice to reduce circular dependencies in
> > general.
>
> Can you give an example of circular dependency? Can this be solved by
> having more "type_fwd.h" headers for forward declarations of opaque types?
>
> (also, generally, it would be desirable to use more of these, since our
> compile times have become egregious as of late - I'm currently
> considering replacing my 8-core desktop CPU with a beefier one :-/)
>
> > 4. It is more verbose to configure than CMake (each compilation unit needs
> > to be spelled out with dependencies).
>
> This sounds really like a bummer. Do you have to spell those out by
> hand? Or is there some tool that infers dependencies and generates the
> declarations for you?
>
> Regards
>
> Antoine.
>
