> On Thursday, July 4, 2019, 07:52:05 PM EDT, Christof Ressi
> wrote:
>> > how big of a list can we have in Pd?
>> 100
> That's only the internal limit for stack allocation in the [list] methods.
There's no easy way to measure the worst-case performance of malloc/free, much
less predict the worst cases on Pd's other supported platforms.
So if low-latency realtime scheduling is desired, the sanest way to achieve it
is to keep the number of malloc/free calls as low as possible during DSP
computation (or, if it's not a DSP-based patch, during the normal runtime of
the patch after all loadbangs have fired).
That means always taking the stack-allocation path in the list classes. The way
to do that is to limit list output sizes to a maximum of 100 elements.
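A minimal sketch of that stack-vs-heap pattern (the macro names and the
stand-in t_atom below are illustrative, not Pd's actual internals; only the
100-element threshold matches the limit discussed above):

```c
/* A minimal sketch of the stack-vs-heap allocation pattern used by the
 * [list] classes. The macro names and the stand-in t_atom are
 * illustrative, not Pd's actual internals; the 100-element threshold
 * matches the internal limit discussed above. */
#include <alloca.h>
#include <stdlib.h>

#define MAXSTACK 100    /* lists below this size never touch the heap */

typedef struct _atom { float a_float; } t_atom;  /* stand-in for Pd's t_atom */

#define ATOMS_ALLOCA(x, n) \
    ((x) = (t_atom *)((n) < MAXSTACK ? \
        alloca((n) * sizeof(t_atom)) : malloc((n) * sizeof(t_atom))))
#define ATOMS_FREEA(x, n) \
    (void)(((n) < MAXSTACK) || (free(x), 0))

/* toy "list method": builds a scratch list of argc ones and sums it */
static float sum_list(int argc)
{
    t_atom *argv;
    float total = 0;
    ATOMS_ALLOCA(argv, argc);  /* heap allocation only if argc >= MAXSTACK */
    for (int i = 0; i < argc; i++)
    {
        argv[i].a_float = 1;
        total += argv[i].a_float;
    }
    ATOMS_FREEA(argv, argc);   /* free() only if we malloc'd */
    return total;
}
```

Keeping list sizes under the threshold means the alloca branch is always taken,
so no malloc/free ever runs during DSP computation.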
Also notice that copying a list requires a loop to copy each element to the new
list (regardless of how the target list got allocated). So for maximally
predictable runtime performance, try not to use any list objects that require
copying lists. (This is why the crazy recursive list-abs/list-drip abstraction
is performant.)
>> > How many elements can a t_atom have?
>> 1000
> What does this even mean?
It means that evaluating a Pd message with more than 1000 elements in a row
which are neither commas nor semis will trigger a malloc/free pair each time
the message gets evaluated. So for low-latency realtime performance, don't do
that during normal patch runtime.
>> For best results, no bigger than:
>> 5
> Everything below 100 elements is equally fine (no heap allocation)
Five elements and below avoid even the alloca call, by assigning into a
5-element array "smallstack" in binbuf_eval. I haven't measured, so I have no
idea whether that's a premature optimization, though.
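The tiering works out to something like the following sketch (constants and
names are approximate, from memory -- this is not the actual binbuf_eval code):

```c
/* Rough sketch of the tiered strategy in binbuf_eval (constants and
 * names are approximate, from memory): messages of up to 5 atoms use a
 * fixed "smallstack" array with no allocation call at all, messages
 * under 100 atoms use alloca, and anything bigger falls back to the
 * heap -- the case to avoid in a realtime patch. */
#include <alloca.h>
#include <stdlib.h>

#define SMALLMSG 5
#define MAXSTACK 100

typedef struct _atom { float a_float; } t_atom;  /* stand-in for Pd's t_atom */

/* returns which tier was used: 0 = smallstack, 1 = alloca, 2 = malloc */
static int eval_message(int nargs)
{
    t_atom smallstack[SMALLMSG], *mstack;
    int tier;
    if (nargs <= SMALLMSG)
        mstack = smallstack, tier = 0;  /* no allocation call at all */
    else if (nargs < MAXSTACK)
        mstack = alloca(nargs * sizeof(t_atom)), tier = 1;
    else
        mstack = malloc(nargs * sizeof(t_atom)), tier = 2;
    /* ... the real function evaluates the message into mstack here ... */
    if (nargs > 0)
        mstack[0].a_float = 0;
    if (tier == 2)
        free(mstack);              /* the malloc/free pair to avoid */
    return tier;
}
```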
Also note: binbuf_eval already has an #ifdef for simply heap-allocating the
entire message size, and the comment there mentions decreased performance for
whatever tests were run. I'm sure it would perform worse for loading patches,
as those are nearly the worst possible case: a large number of args where semis
break the message up into fairly regular small chunks that typically fit into
an alloca or even the smallstack (e.g., "#X obj 20 20 bang;").
Random digression-- why doesn't t_binbuf just cache some annotations about
itself, like:
* a count for its longest nested message (e.g., the longest contiguous region
with no semis or commas)
* a gpointer count
* whether it's a constant message (e.g., do we have dollar signs)
It appears every way to populate a t_binbuf already loops through the elements,
*and* allocates memory there. Seems like users would see a performance increase
from caching at least the maxnargs in the t_binbuf itself, to avoid a full loop
through the binbuf in binbuf_eval.
Constant message annotation could be nice, too. If a message box was just [1 2
3 4 5( then the message could just do an end run around the message evaluation.
(Of course that would add a branch for every message that isn't constant, so
it'd need to be measured...)
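For concreteness, such a cache could look like the sketch below. All names here
are invented for illustration; none of this exists in Pd's actual t_binbuf (and
the gpointer count is omitted):

```c
/* Hypothetical annotation cache for t_binbuf as mused about above. All
 * names here are invented for illustration; none of this exists in
 * Pd's actual t_binbuf (and the gpointer count is omitted). */
typedef struct _atom { int a_type; } t_atom;     /* stand-in for Pd's t_atom */
#define A_SEMI   1
#define A_COMMA  2
#define A_DOLLAR 3

typedef struct _binbuf_annot {
    int b_maxnargs;   /* longest run of atoms with no semi or comma */
    int b_ndollar;    /* dollar-sign count: zero means a constant message */
} t_binbuf_annot;

/* One pass over the atoms, done once when the binbuf is populated, so
 * binbuf_eval could size its stack (and take a constant-message fast
 * path) without re-scanning. */
static t_binbuf_annot binbuf_annotate(const t_atom *vec, int n)
{
    t_binbuf_annot a = {0, 0};
    int run = 0;
    for (int i = 0; i < n; i++)
    {
        if (vec[i].a_type == A_SEMI || vec[i].a_type == A_COMMA)
            run = 0;
        else if (++run > a.b_maxnargs)
            a.b_maxnargs = run;
        if (vec[i].a_type == A_DOLLAR)
            a.b_ndollar++;
    }
    return a;
}

/* example message "f f ; f $1 f": longest run is 3, one dollar sign */
static t_binbuf_annot demo_annot(void)
{
    t_atom vec[6] = {{0}, {0}, {A_SEMI}, {0}, {A_DOLLAR}, {0}};
    return binbuf_annotate(vec, 6);
}
```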
-Jonathan