Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-24 Thread jonas . hahnfeld
On 2020/04/24 08:15:09, Valentin Villenave wrote:
> On 2020/04/24 07:20:55, hahnjo wrote:
> > If there are no technical objections, I think we should move
> > forward and let something as big sit around for too long.
> 
> Just for clarification, did you miss a "not" in that sentence?

Ehm yes. Probably past me couldn't decide where to put it...

If there are no technical objections, I think we should move
forward and not let something as big sit around for too long.

https://codereview.appspot.com/573670043/



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-24 Thread v . villenave
On 2020/04/24 07:20:55, hahnjo wrote:
> If there are no technical objections, I think we should move
> forward and let something as big sit around for too long.

Just for clarification, did you miss a "not" in that sentence?

Cheers,
  - V.

https://codereview.appspot.com/573670043/



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-24 Thread jonas . hahnfeld
So I've largely kept out of the discussion for now, mostly because I'm
not overly familiar with the code. However I fully agree with Dan that a
macro pretending to be a member function is neither obvious nor C++
style. As such I'm in favor of doing the conversion as it better meets
the expectations of average developers (at least mine).

I also agree with David that this patch doesn't prescribe any future
direction, which seems to be Han-Wen's fear. Any possible follow-up is
subject to the same review process. If there are no technical
objections, I think we should move forward and let something as big sit
around for too long.

https://codereview.appspot.com/573670043/



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-13 Thread Carl . D . Sorensen
I am in favor of this patch because of the following:

1) David K. has a long history of improving things through changes like
this.
2) It does not change the user interface or any files that a user
accesses
3) David has done the work of making all the code changes this syntax
change requires

Carl

https://codereview.appspot.com/573670043/



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-13 Thread dak
On 2020/04/13 23:07:57, Dan Eble wrote:
> This change per se LGTM.  I remember discussing this syntactic change
briefly on
> the list a few(?) months ago, so this is not surprising.
> 
> I'm quite pleased with this change, actually.  I remember how I felt
the first
> time I came across klass->a_macro_actually(...) and couldn't find the
method in
> klass.hh.  I'm glad future contributors will not have to repeat that
experience.

It's embarrassing to admit, but while I did voice something akin to
making macros imitate member function call syntax, I actually forgot
that I had exactly the same kind of somewhat time-consuming double take
when trying to track down that admittedly clever code.

https://codereview.appspot.com/573670043/



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-13 Thread nine . fierce . ballads
This change per se LGTM.  I remember discussing this syntactic change
briefly on the list a few(?) months ago, so this is not surprising.

I'm quite pleased with this change, actually.  I remember how I felt the
first time I came across klass->a_macro_actually(...) and couldn't find
the method in klass.hh.  I'm glad future contributors will not have to
repeat that experience.

https://codereview.appspot.com/573670043/



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-12 Thread dak
On 2020/04/12 10:35:24, hanwenn wrote:
> On Sat, Apr 11, 2020 at 3:55 PM David Kastrup 
wrote:
> > >> You haven't understood the scheme.  A grob only contains the
properties
> > >> seminal to its type.  That's why there is a double indirection. 
A full
> > >> array of 416 entries is needed per grob _type_, not per grob.
> > >
> > > I still don't see why this requires an overall overhaul of the
> > > internal calling convention. You can have a global registry of
symbol
> > > => uint16 (numbering all the symbols),
> >
> > It would be pointless to have a single registry for music
properties,
> > grob properties, stream event properties, and context properties,
thus
> > significantly blowing up the second-level indexing tables and making
for
> > a large number of cache misses.
> 
> there are 400 grob properties, and about 800 properties in total
> (including music and context). 400 requires 9 bits, 800 requires 10
> bits. It doesn't appreciably change the problem size.

Populating arrays densely or sparsely makes a large difference in how
well they fit caches.  However, this is completely irrelevant since this
patch does not, I repeat _not_ enforce any particular implementation. 
All that it does is provide the macros with more complete information
and thus enables implementations that are not crippled down by having
macros pretend to be able to deal with overload resolution.

That you don't feel that you are able to make use of that change does
not mean that you are the only developer who should be allowed to work
on such problems and consequently should be able to block off any
attempt of restructuring the source code in a manner that allows other
developers to more easily experiment with other approaches without
having to constantly resolve rebase conflicts because a design decision
unsuitable for refactoring is kept in place for no good reason.

You'll find that we have had several restructurings of macro call
locations in the past years in order to facilitate refactorings, some
with more, some with less success and permanent impact.  In general, the
developers proposing them had reasonable expectations for the directions
those changes would open for them, and reasonable consideration for the
impact on others.  Please keep in mind that LilyPond, as it stands now,
is a colloborative project.  Accommodating the work of others to
proceed, given reasonable goals and impact, is therefore a necessity.  A
stance that amounts to "I cannot make that work, so we shouldn't move in
a direction where you could try" is not really useful for a
colloborative effort of developers with equal standing.

I had several other proposals of syntax that would provide required
information to the macros in case, but this is the least intrusive one
that retains some flavor of overloading.

https://codereview.appspot.com/573670043/



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-12 Thread Han-Wen Nienhuys
On Sat, Apr 11, 2020 at 3:55 PM David Kastrup  wrote:
> >> You haven't understood the scheme.  A grob only contains the properties
> >> seminal to its type.  That's why there is a double indirection.  A full
> >> array of 416 entries is needed per grob _type_, not per grob.
> >
> > I still don't see why this requires an overall overhaul of the
> > internal calling convention. You can have a global registry of symbol
> > => uint16 (numbering all the symbols),
>
> It would be pointless to have a single registry for music properties,
> grob properties, stream event properties, and context properties, thus
> significantly blowing up the second-level indexing tables and making for
> a large number of cache misses.

there are 400 grob properties, and about 800 properties in total
(including music and context). 400 requires 9 bits, 800 requires 10
bits. It doesn't appreciably change the problem size.

> > memoize that uint16 like we memoize the symbol today (I think this is
> > what you suggested already).  Then use a uint16 => value hashtable in
> > the grob/prob/etc. A custom hash implementation can save space by
> > storing the uint16 key rather than the 8 byte SCM symbol.
> >
> > This would be a very conservative, localized change that doesn't
> > require setting up machinery to convert types to vectors,
>
> A vector is cheaper to index than a hashtable.  It makes no sense
> talking aobut "setting up machinery" with regard to a vector while
> taking hash tables for granted.
>
> > and it keeps working if grob types change which grob-interfaces they
> > support halfway through.
>
> I don't think there is any point in discussing this patch based on what
> you surmise about an ultimate solution.


Have a look at https://github.com/hanwen/lilypond/tree/grob-dict

It explores different proposals, all of them with neglible to negative
performance impact.

The last series introduces a 'symid' (a global, memoized uint16
identifier for a property), with the optimized hashtable.

I think you could implement what you want to do try by doing the symid
=> per grob type vector index lookup in Grob::internal_xx_property. I
still don't see what you'd do with the infromation you gain by
reconstructing the call sites. Even if you know the C++ type (Grob vs
Context) in the call site, you still won't know the actual grob type
(Stem vs. NoteHead) in the call sites, which you'd need to do further
memoization.

I looked a bit closer at your call-site reorganization. It introduces
'this' as parameter in many places, which I find ugly, and certainly
is nonstandard idiom in C++. I suggest we proceed only if you can
produce a measurable performance improvement. If that happens, I'd be
happy to help deal with any merge conflicts. Until that time, you can
work on the change by not rebasing it.

> >> Hashing outside of a closed set still requires storing the actual symbol
> >> for corroboration.  Hashing inside of a closed set amounts to the
> >> first-stage lookup in my scheme, except that a small number guaranteed
> >> to be unique comes out.  This lookup can be memoised and will work
> >> permanently as opposed to hashing in a growing set.
> >>
> >> > We could experiment with this without requiring a giant rewrite of the
> >> > source code.
> >>
> >> What you call a "giant rewrite" is basically this patch.  It's a purely
> >> syntactical patch only extending rather than reducing possible
> >> approaches and perfectly compatible with different implementations.  The
> >> work for that patch is already done, the impact is similar to what
> >> happened when changing around the syntax of unsmob a few times ian the
> >> past.  It will allow for prolonged work on possibly different
> >> implementations without having to constantly deal with merge conflicts.
> >>
> >> It does not depend on picking a particular implementation.  You
> >> expressed interest in what I was planning to do with the opportunity
> >> this patch opens, I answered.  I do not, however, see the point in
> >> discussing and dissing (partially) unwritten code based on an
> >> incomplete understanding in a handwaving discussion.
> >
> > I think it is not out of the ordinary to discuss plans invasive plans
> > before implementing them.
>
> But we are not discussing an invasive plan here.  This here is an
> extensive, not an invasive change.  There is no point in dissing
> unwritten code that take no project-wide resources.  The point of time
> to review code is after it is proposed.
>
> > I think this change is invasive because it changes the idiom across
> > the source code, and I disagree that this change is net neutral. C++
> > method calls are a more natural idiom plain function calls.
>
> We are not talking about C++ method calls here.  We are talking about
> macro calls.  Memoisation of any kind can only be implemented at the
> macro call level when diversifying on a string literal since those are
> not available as template arguments, otherwise one could transfer the
> 

Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-11 Thread David Kastrup
Han-Wen Nienhuys  writes:

> On Fri, Apr 10, 2020 at 1:07 PM David Kastrup  wrote:
>> > * memory use: each SCM value takes 8 bytes, and there are 416 grob
>> > properties today, for a total of 3328 bytes. Morgenlied is single page
>> > of music and has 3704 grobs. So storage for the vectors (which will be
>> > mostly empty) will take up 12M / page. This likely will result in
>> > doubling the memory use of LilyPond.
>>
>> You haven't understood the scheme.  A grob only contains the properties
>> seminal to its type.  That's why there is a double indirection.  A full
>> array of 416 entries is needed per grob _type_, not per grob.
>
> I still don't see why this requires an overall overhaul of the
> internal calling convention. You can have a global registry of symbol
> => uint16 (numbering all the symbols),

It would be pointless to have a single registry for music properties,
grob properties, stream event properties, and context properties, thus
significantly blowing up the second-level indexing tables and making for
a large number of cache misses.

> memoize that uint16 like we memoize the symbol today (I think this is
> what you suggested already).  Then use a uint16 => value hashtable in
> the grob/prob/etc. A custom hash implementation can save space by
> storing the uint16 key rather than the 8 byte SCM symbol.
>
> This would be a very conservative, localized change that doesn't
> require setting up machinery to convert types to vectors,

A vector is cheaper to index than a hashtable.  It makes no sense
talking aobut "setting up machinery" with regard to a vector while
taking hash tables for granted.

> and it keeps working if grob types change which grob-interfaces they
> support halfway through.

I don't think there is any point in discussing this patch based on what
you surmise about an ultimate solution.

> There are many types (stem, axis-group, beam, hairpin) that properties
> that are mostly unset. A fixed encoding will waste CPU cache and main
> memory to allow these properties to be set even if it is not
> necessary, so aside from increased complexity, it's not obvious that
> there will be a performance gain.

We can discuss this until the cows go home, but the authoritive answer
will be benchmarks.  They may indicate the necessity for further work,
and doing further work is facilitated by not having to track any added
C++ code accessing properties.  Hence this patch.

I actually expect more gains to be realised by rewriting
break-substitution.cc rather than the actual property access, but having
property access made cheap is certainly not undesirable either.

>> Hashing outside of a closed set still requires storing the actual symbol
>> for corroboration.  Hashing inside of a closed set amounts to the
>> first-stage lookup in my scheme, except that a small number guaranteed
>> to be unique comes out.  This lookup can be memoised and will work
>> permanently as opposed to hashing in a growing set.
>>
>> > We could experiment with this without requiring a giant rewrite of the
>> > source code.
>>
>> What you call a "giant rewrite" is basically this patch.  It's a purely
>> syntactical patch only extending rather than reducing possible
>> approaches and perfectly compatible with different implementations.  The
>> work for that patch is already done, the impact is similar to what
>> happened when changing around the syntax of unsmob a few times ian the
>> past.  It will allow for prolonged work on possibly different
>> implementations without having to constantly deal with merge conflicts.
>>
>> It does not depend on picking a particular implementation.  You
>> expressed interest in what I was planning to do with the opportunity
>> this patch opens, I answered.  I do not, however, see the point in
>> discussing and dissing (partially) unwritten code based on an
>> incomplete understanding in a handwaving discussion.
>
> I think it is not out of the ordinary to discuss plans invasive plans
> before implementing them.

But we are not discussing an invasive plan here.  This here is an
extensive, not an invasive change.  There is no point in dissing
unwritten code that take no project-wide resources.  The point of time
to review code is after it is proposed.

> I think this change is invasive because it changes the idiom across
> the source code, and I disagree that this change is net neutral. C++
> method calls are a more natural idiom plain function calls.

We are not talking about C++ method calls here.  We are talking about
macro calls.  Memoisation of any kind can only be implemented at the
macro call level when diversifying on a string literal since those are
not available as template arguments, otherwise one could transfer the
job to template specialisation like it is done in a few other places.
So the macro needs information to work with.  This change makes what
actually happens more rather than less transparent and idiomatic.

> If we can get a significant performance boost, that may be 

Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-11 Thread Han-Wen Nienhuys
On Sat, Apr 11, 2020 at 2:53 PM Jonas Hahnfeld  wrote:
>
> Am Samstag, den 11.04.2020, 14:45 +0200 schrieb Han-Wen Nienhuys:
> > I think it is not out of the ordinary to discuss plans invasive plans
> > before implementing them.
>
> To be fair, David did ask about it two months ago:
> https://lists.gnu.org/archive/html/lilypond-devel/2020-02/msg00668.html
>

I did ask " I'm curious about your plans. Can you say more? "

My question was smushed into the quote in the archive, maybe because
of the vagaries of HTML vs. Text mail.

https://lists.gnu.org/archive/html/lilypond-devel/2020-02/msg00705.html

> (I don't understand this part of LilyPond, so I'll keep out of the
> technical discussion.)

--
Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-11 Thread Jonas Hahnfeld
Am Samstag, den 11.04.2020, 14:45 +0200 schrieb Han-Wen Nienhuys:
> I think it is not out of the ordinary to discuss plans invasive plans
> before implementing them.

To be fair, David did ask about it two months ago:
https://lists.gnu.org/archive/html/lilypond-devel/2020-02/msg00668.html

(I don't understand this part of LilyPond, so I'll keep out of the
technical discussion.)


signature.asc
Description: This is a digitally signed message part


Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-11 Thread Han-Wen Nienhuys
On Sat, Apr 11, 2020 at 2:45 PM Han-Wen Nienhuys  wrote:
> > What you call a "giant rewrite" is basically this patch.  It's a purely
> > syntactical patch only extending rather than reducing possible
> > approaches and perfectly compatible with different implementations.  The
> > work for that patch is already done, the impact is similar to what
> > happened when changing around the syntax of unsmob a few times ian the
> > past.  It will allow for prolonged work on possibly different
> > implementations without having to constantly deal with merge conflicts.
> >
> > It does not depend on picking a particular implementation.  You
> > expressed interest in what I was planning to do with the opportunity
> > this patch opens, I answered.  I do not, however, see the point in
> > discussing and dissing (partially) unwritten code based on an incomplete
> > understanding in a handwaving discussion.
>
> I think it is not out of the ordinary to discuss plans invasive plans
> before implementing them.  I think this change is invasive because it
> changes the idiom across the source code, and I disagree that this
> change is net neutral. C++ method calls are a more natural idiom plain
> function calls. If we can get a significant performance boost, that
> may be worth it, but I think we should first understand if that is the
> case.  As said (see above), I am skeptical.

Also, property access accounts for about 10% of runtime, last time I
looked. This can probably be reduced by some factor, but even if it
were infinitely fast, it is not going to make a huge difference to the
overall runtime.

-- 
Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-11 Thread Han-Wen Nienhuys
On Fri, Apr 10, 2020 at 1:07 PM David Kastrup  wrote:
> > * memory use: each SCM value takes 8 bytes, and there are 416 grob
> > properties today, for a total of 3328 bytes. Morgenlied is single page
> > of music and has 3704 grobs. So storage for the vectors (which will be
> > mostly empty) will take up 12M / page. This likely will result in
> > doubling the memory use of LilyPond.
>
> You haven't understood the scheme.  A grob only contains the properties
> seminal to its type.  That's why there is a double indirection.  A full
> array of 416 entries is needed per grob _type_, not per grob.

I still don't see why this requires an overall overhaul of the
internal calling convention. You can have a global registry of symbol
=> uint16 (numbering all the symbols), memoize that uint16 like we
memoize the symbol today (I think this is what you suggested already).
Then use a uint16 => value hashtable in the grob/prob/etc. A custom
hash implementation can save space by storing the uint16 key rather
than the 8 byte SCM symbol.

This would be a very conservative, localized change that doesn't
require setting up machinery to convert types to vectors, and it keeps
working if grob types change which grob-interfaces they support
halfway through.

There are many types (stem, axis-group, beam, hairpin) that properties
that are mostly unset. A fixed encoding will waste CPU cache and main
memory to allow these properties to be set even if it is not
necessary, so aside from increased complexity, it's not obvious that
there will be a performance gain.

> Hashing outside of a closed set still requires storing the actual symbol
> for corroboration.  Hashing inside of a closed set amounts to the
> first-stage lookup in my scheme, except that a small number guaranteed
> to be unique comes out.  This lookup can be memoised and will work
> permanently as opposed to hashing in a growing set.
>
> > We could experiment with this without requiring a giant rewrite of the
> > source code.
>
> What you call a "giant rewrite" is basically this patch.  It's a purely
> syntactical patch only extending rather than reducing possible
> approaches and perfectly compatible with different implementations.  The
> work for that patch is already done, the impact is similar to what
> happened when changing around the syntax of unsmob a few times ian the
> past.  It will allow for prolonged work on possibly different
> implementations without having to constantly deal with merge conflicts.
>
> It does not depend on picking a particular implementation.  You
> expressed interest in what I was planning to do with the opportunity
> this patch opens, I answered.  I do not, however, see the point in
> discussing and dissing (partially) unwritten code based on an incomplete
> understanding in a handwaving discussion.

I think it is not out of the ordinary to discuss plans invasive plans
before implementing them.  I think this change is invasive because it
changes the idiom across the source code, and I disagree that this
change is net neutral. C++ method calls are a more natural idiom plain
function calls. If we can get a significant performance boost, that
may be worth it, but I think we should first understand if that is the
case.  As said (see above), I am skeptical.

> It is more efficient to discuss objections of the "you haven't thought
> this through" kind on working code.
>
> Nothing in this patch here enforces picking a particular route.  It is
> just paving the ground for more work, and it does so in a manner that
> isn't so much of an eye sore that it can only be tolerated in
> expectation of immense future gains.

-- 
Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-10 Thread David Kastrup
Han-Wen Nienhuys  writes:

> On Thu, Apr 9, 2020 at 7:45 PM  wrote:
>>
>> On 2020/04/09 17:07:46, hanwenn wrote:
>> > I'm curious about these optimizations. Can you say more?
>>
>> Properties are currently stored in alists.  They can be stored in
>> vectors instead.  Give all grob properties a number, then have an array
>> per grob type mapping this number to an index into an array.  The first
>> number can be memoized for literal property names, turning a property
>> lookup into two indexing operations.
>>
>> That's the gist.  For Scheme code, the memoization could be replaced by
>> macro compilation, but Guilev2 byte compilation of course is sort of a
>> roadblock here, too.
>
> I think it is a great idea to store properties in vectors rather than
> alists. They take up less space, and lookups are faster.
>
> Your specific approach sounds very laborious, and I see many
> downsides, for example:
>
> * this will make it hard to add new grob-properties at runtime

It won't.  The likely long-term restriction will be that you cannot set
new properties on items (grobs/music/stream events) that existed already
at the time the new property is being defined.  Our current way of
"adding new grob-properties at runtime" is poking around in undocumented
internals in a manner that bleeds into unrelated source files.  It is
possible to set up a backward-compatibility scheme for this temporarily
(basically, before an error gets thrown, the respective
object-properties are checked, and a proper registration of the property
is performed in case those properties have been set by the user, also
meaning that they can be prevented from bleeding into the next session
as long as they are used at least once).

Adding a proper interface for registering user-added properties in a
documented and dependable manner (rather than poking around in the
internals of what happens to be the current implementation) is a
semi-unrelated project that would be independently desirable.

> * memory use: each SCM value takes 8 bytes, and there are 416 grob
> properties today, for a total of 3328 bytes. Morgenlied is single page
> of music and has 3704 grobs. So storage for the vectors (which will be
> mostly empty) will take up 12M / page. This likely will result in
> doubling the memory use of LilyPond.

You haven't understood the scheme.  A grob only contains the properties
seminal to its type.  That's why there is a double indirection.  A full
array of 416 entries is needed per grob _type_, not per grob.

> * Let's not put another roadblock in front of GUILE v2 adoption.

The Scheme angle can be tackled differently and is not part of the first
phase.

> Have you considered building a custom hash table for the properties?

I did.

> We could construct a table based on probing (so we don't need buckets
> which would reintroduce linked lists). Since symbols have unique
> addresses, we could do the hashing by casting SCM to uintptr and
> hashing a uint64 down to the table size. This means we'll still do the
> uint64 -> index on each lookup, but it's likely much faster than
> walking alists, and would cut memory for properties in half.

Hashing outside of a closed set still requires storing the actual symbol
for corroboration.  Hashing inside of a closed set amounts to the
first-stage lookup in my scheme, except that a small number guaranteed
to be unique comes out.  This lookup can be memoised and will work
permanently as opposed to hashing in a growing set.

> We could experiment with this without requiring a giant rewrite of the
> source code.

What you call a "giant rewrite" is basically this patch.  It's a purely
syntactical patch only extending rather than reducing possible
approaches and perfectly compatible with different implementations.  The
work for that patch is already done, the impact is similar to what
happened when changing around the syntax of unsmob a few times in the
past.  It will allow for prolonged work on possibly different
implementations without having to constantly deal with merge conflicts.

It does not depend on picking a particular implementation.  You
expressed interest in what I was planning to do with the opportunity
this patch opens, I answered.  I do not, however, see the point in
discussing and dissing (partially) unwritten code based on an incomplete
understanding in a handwaving discussion.

It is more efficient to discuss objections of the "you haven't thought
this through" kind on working code.

Nothing in this patch here enforces picking a particular route.  It is
just paving the ground for more work, and it does so in a manner that
isn't so much of an eye sore that it can only be tolerated in
expectation of immense future gains.

-- 
David Kastrup
My replies have a tendency to cause friction.  To help mitigating
damage, feel free to forward problematic posts to me adding a subject
like "timeout 1d" (for a suggested timeout of 1 day) or "offensive".



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-10 Thread Han-Wen Nienhuys
On Thu, Apr 9, 2020 at 7:45 PM  wrote:
>
> On 2020/04/09 17:07:46, hanwenn wrote:
> > I'm curious about these optimizations. Can you say more?
>
> Properties are currently stored in alists.  They can be stored in
> vectors instead.  Give all grob properties a number, then have an array
> per grob type mapping this number to an index into an array.  The first
> number can be memoized for literal property names, turning a property
> lookup into two indexing operations.
>
> That's the gist.  For Scheme code, the memoization could be replaced by
> macro compilation, but Guilev2 byte compilation of course is sort of a
> roadblock here, too.

I think it is a great idea to store properties in vectors rather than
alists. They take up less space, and lookups are faster.

Your specific approach sounds very laborious, and I see many
downsides, for example:

* this will make it hard to add new grob-properties at runtime

* memory use: each SCM value takes 8 bytes, and there are 416 grob
properties today, for a total of 3328 bytes. Morgenlied is single page
of music and has 3704 grobs. So storage for the vectors (which will be
mostly empty) will take up 12M / page. This likely will result in
doubling the memory use of LilyPond.

* Let's not put another roadblock in front of GUILE v2 adoption.

Have you considered building a custom hash table for the properties?
We could construct a table based on probing (so we don't need buckets
which would reintroduce linked lists). Since symbols have unique
addresses, we could do the hashing by casting SCM to uintptr and
hashing a uint64 down to the table size. This means we'll still do the
uint64 -> index on each lookup, but it's likely much faster than
walking alists, and would cut memory for properties in half.

We could experiment with this without requiring a giant rewrite of the
source code.

-- 
Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-09 Thread dak
On 2020/04/09 17:07:46, hanwenn wrote:
> I'm curious about these optimizations. Can you say more?

Properties are currently stored in alists.  They can be stored in
vectors instead.  Give all grob properties a number, then have an array
per grob type mapping this number to an index into an array.  The first
number can be memoized for literal property names, turning a property
lookup into two indexing operations.

That's the gist.  For Scheme code, the memoization could be replaced by
macro compilation, but Guilev2 byte compilation of course is sort of a
roadblock here, too.

https://codereview.appspot.com/573670043/



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-09 Thread hanwenn
I'm curious about these optimizations. Can you say more?

https://codereview.appspot.com/573670043/



Re: Refactor get/set_property to take the item as first argument (issue 573670043 by d...@gnu.org)

2020-04-09 Thread hanwenn
I'm curious about these optimizations. Can you say more?

https://codereview.appspot.com/573670043/