Thanks for the feedback - updated.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc
File src/compiler/register-allocator.cc (right):

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc#newcode354
src/compiler/register-allocator.cc:354: InvalidateWeightAndSize();
On 2015/06/11 05:30:29, Benedikt Meurer wrote:
Can we initialize properly instead of (mis)using an
InvalidateSomething method
in the constructor?

It would be inefficient to perform weight and size calculations at this
stage. In fact, this reminds me, certain live range operations - like
handling spill operands - can be done upfront before even worrying about
(calculating) size and weights. I plan on working on that in upcoming
changes, so I'd prefer not to change something that sets me in the right
direction.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc#newcode888
src/compiler/register-allocator.cc:888: auto start = Start();
On 2015/06/11 05:30:28, Benedikt Meurer wrote:
Don't use auto here.

Done.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc#newcode909
src/compiler/register-allocator.cc:909: for (; pos != nullptr; pos =
pos->next()) {
On 2015/06/11 05:30:28, Benedikt Meurer wrote:
That looks kinda inefficient to me. Seems to be linear in the size of
the
program in the worst case. What does LLVM do here?

Mozilla, at least, does something similar. LLVM's code is more
complicated, but the high level description of the algorithm calls for a
calculation of use densities, and factoring in loop-ness is important in
ensuring hot ranges don't lose to colder ones.

We could, eventually, change the LiveRange model and the stages
computing them to pre-calculate certain components, for example the
loop-ness of use positions, if that information is readily available
earlier. But I believe that would be too much churn at this stage.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc#newcode2309
src/compiler/register-allocator.cc:2309: UsePosition* hint =
current->FirstHintPosition(&hint_register);
On 2015/06/11 05:30:28, Benedikt Meurer wrote:
Why did you change this?

Probably some experimentations, thanks for the catch - there's no reason
for this to be changed.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc#newcode2633
src/compiler/register-allocator.cc:2633: Invalidate();
On 2015/06/11 05:30:28, Benedikt Meurer wrote:
Why invalidate here?

Because we're at the end, we're out of matching conflicts. I want the
end iterator to have a canonical shape - valid storage, null query and
past-end pos - so that == and != work well. Hence calling Invalidate.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc#newcode2639
src/compiler/register-allocator.cc:2639: Invalidate();
On 2015/06/11 05:30:28, Benedikt Meurer wrote:
Same here

See above.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.cc#newcode2681
src/compiler/register-allocator.cc:2681: auto q_start = query_->start();
On 2015/06/11 05:30:29, Benedikt Meurer wrote:
No auto here.

Done.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h
File src/compiler/register-allocator.h (right):

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h#newcode436
src/compiler/register-allocator.h:436: LiveRangeGroup* group() { return
group_; }
On 2015/06/11 05:30:29, Benedikt Meurer wrote:
Nit: Make this method const.

Done.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h#newcode440
src/compiler/register-allocator.h:440: DCHECK(weight_ >= 0.0f);
On 2015/06/11 05:30:29, Benedikt Meurer wrote:
Trivial getters should not have DCHECK's in them. Check on assignment
or
introduce a setter.

The point is that I want to make sure no attempt to access weight is
made without pre-calculating it. I'll call it GetWeight then.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h#newcode444
src/compiler/register-allocator.h:444: unsigned size() {
On 2015/06/11 05:30:30, Benedikt Meurer wrote:
This getter does perform work, so it's not a simple getter, hence it
must not
look like one. Please rename to GetSize().

Done.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h#newcode521
src/compiler/register-allocator.h:521: ZoneSet<LiveRange*> ranges_;
On 2015/06/11 05:30:30, Benedikt Meurer wrote:
I'm not sure ZoneSet is such a great datastructure for this purpose.

What would you propose as an alternative?

Full disclosure, I did consider a R x I matrix - where "R" is the number
of physical registers and "I" is the number of positions. That would
simplify a number of things, especially the conflict iterator. Lookup
would be O(1), too - rather than O(log(n)) right now.

However, the memory cost could be high. Using some codebases we compile
with subzero - so a similar in spirit scenario (C code - if you squint
and think of asm.js as really "C"), we saw programs with 100K IR
instructions coming into the allocator. Granted, different IR language,
but I feel we could use it as a reasonable hint. So for that scenario,
since we here have 4 positions per instruction, it'd mean ~400K per
register, so then a few megs just for this structure.

That worried me a bit. I didn't continue with the analysis - there is a
cutoff point after which a set may also become more expensive than a
dumb matrix. My priority still is getting the codegen to a good place
where it shows the value of the algorithm, so I postponed these
decisions, and I think the APIs through which the rest of the allocator
uses this storage are abstract enough for me to swap implementations
later and analyze this area better.

The alternative I had with the initial checkin before was inspired by
Mozilla and turned out to be *really* bad. Mozilla uses a splay tree,
but then they chance on there being no more than 1 or 2 conflicts per
candidate live range. If that's not the case, the range loses
automatically. Seemed suboptimal to me, and also a choice that makes
understanding the effects of the algorithm on large programs hard - you
can't rely on "best choice wins", it is rather "best choice, or
sometimes not, if more than a few worse choices are conflicting" :)
Using that implementation choice, compile time for programs like zlib
ballooned to over 5 minutes (!!) from a few hundred ms.

I was planning actually to talk to the author (on the Mozilla side), at
our last social, to get a better understanding of the tradeoffs he found
- but he had a conflict, so next time.

I haven't analyzed much LLVM's, they call it a Matrix though,
interestingly enough.

At the end of the day, with this structure, I get a compile time in the
neighborhood of what linear does - for complicated programs (like zlib),
the difference is noticeable (100 vs 148 ms spent in the turbofan
pipeline, with regalloc alone being 3x more expensive - from 18ms to
66ms), but it's not 5 minutes, and at this stage, all I care about is
codegen quality - gathering data to allow us to analyse the value of
this allocator, for a scenario that's mainly AoT compilation. Not saying
compile time doesn't matter - just prioritizing it after we feel happy
enough with the codegen.

Sorry for the long rant :) But if you have suggestions for improvements
here, let's chat, but I'd prefer separating that from this particular CL
- I need to land this to unblock more changes in the code quality side
of things - but this structure, I feel, will be the battleground for
faster compile time, so I care about it quite a lot.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h#newcode924
src/compiler/register-allocator.h:924: struct Comparer : public
std::binary_function<AllocatedInterval,
On 2015/06/11 05:30:30, Benedikt Meurer wrote:
Why do you need this Comparer class? Can't you just define operator<
on
AllocatedInterval directly?

Oh, I see - because std::less would then just use that? Sure - this way
though I could have other intrinsic comparers to AllocatedInterval,
while in the scenario of the storage, I want this one. I admit, that's
all hypothetical, though. At the same time, I see no inefficiency with
the current design, and, subjectively, I find it clearer.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h#newcode937
src/compiler/register-allocator.h:937: // Iterator over allocated live
ranges (CoalescedLiveRanges) that conflict with
On 2015/06/11 05:30:29, Benedikt Meurer wrote:
This whole iterator looks quite complex. Maybe we should have a better
data
structure that does not require such a complex iterator instead?

Possibly, please see long rant above :)

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h#newcode941
src/compiler/register-allocator.h:941: friend class CoalescedLiveRanges;
On 2015/06/11 05:30:30, Benedikt Meurer wrote:
friend classes should be listed in the private section.

Done.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h#newcode947
src/compiler/register-allocator.h:947: LiveRange* operator*() { return
pos_->range; }
On 2015/06/11 05:30:29, Benedikt Meurer wrote:
This should return a LiveRange&, not a LiveRange*.

The element of the set if "LiveRange*", so dereferencing the iterator
produces a value typed as that. It's akin to how for a
std::set<LiveRange*>, dereferencing the iterator also returns
LiveRange*.

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h#newcode950
src/compiler/register-allocator.h:950: bool IsEnd() const {
On 2015/06/11 05:30:29, Benedikt Meurer wrote:
STL iterators compare against some end. Don't try to add a wrapper for
that,
which might get out of sync.

I'm not sure how it could it get out of sync (and what it's sync-ed to).

The trade-off here is: similarity with reusable collections, vs some
code simplicity savings in a few places. If we removed IsEnd, we'd have
to carry around the "right" end, however all scenarios we're using this
iterator use it in a very constrained manner - no mutations for example.
But, if we feel similarity with STL trumps that, happy to change it
accordingly (removing IsEnd). I went back and forth on this one, so any
nudge will do :)

https://codereview.chromium.org/1157663007/diff/120001/src/compiler/register-allocator.h#newcode989
src/compiler/register-allocator.h:989: Invalidate();
On 2015/06/11 05:30:29, Benedikt Meurer wrote:
Initialize properly instead of calling Invalidate in the constructor.

Not sure what you mean by "proper" initialization, but the goal here is
to have an "end" constructor for a given storage. I'm using "Invalidate"
as a canonical way of achieving that. What trade-offs am I missing?

https://codereview.chromium.org/1157663007/diff/120001/src/disassembler.cc
File src/disassembler.cc (right):

https://codereview.chromium.org/1157663007/diff/120001/src/disassembler.cc#newcode297
src/disassembler.cc:297: }
On 2015/06/11 05:30:30, Benedikt Meurer wrote:
Please undo this change, especially since this file is otherwise
unchanged.

This (well, undoing it) makes git cl format complain, actually... I'll
figure out the git gymnastics to get out of that.

https://codereview.chromium.org/1157663007/

--
--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev
--- You received this message because you are subscribed to the Google Groups "v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to