On 01.01.2011 14:39, Levente Uzonyi wrote:
> On Sat, 1 Jan 2011, Philippe Marschall wrote:
> 
>> Hi
>>
>> I've been doing some performance work lately in Seaside. Long story
>> short: Seaside (and I guess AIDA too) spends most of its rendering time
>> in WriteStream (#nextPutAll:, #nextPut:).
>>
>> The way WriteStream >> #pastEndPut: behaves is not really ideal for
>> Seaside. It grows the underlying collection by just enough to
>> accommodate the argument collection (or 20, whichever is bigger). Now
> 
> No it doesn't. The code in #pastEndPut: is
> 
> collection := collection grownBy: ((collection size max: 20) min: 1000000).
> 
> so the argument of #grownBy: is at least 20, at most 1000000, and it's
> the size of the internal buffer (collection) if that size is between 20
> and 1000000. So in most practical cases it's the value of [collection
> size]. #grownBy: allocates a new collection whose size is the
> collection's size + the argument, which is 2 * self size in the most
> common case. So the size grows exponentially till it reaches 1000000 and
> then linearly (in steps of a relatively large constant, 1000000). This
> guarantees that the cost of #nextPut: is (amortized) constant in most
> cases.

You're right, I misread the code, that's the behaviour for #nextPut:.
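
Evaluating that growth rule in a workspace makes the doubling visible.
This is a sketch of just the arithmetic (the 16384 stop value is
arbitrary), not the actual stream code:

        | size grows |
        size := 20.
        grows := OrderedCollection with: size.
        [size < 16384] whileTrue: [
                size := size + ((size max: 20) min: 1000000).
                grows add: size].
        grows "20, 40, 80, ... 10240, 20480 -- doubling until the cap"
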
However #nextPutAll: does:

        newEnd := position + aCollection size.
        newEnd > writeLimit ifTrue:
                [self growTo: newEnd + 10].

and #growTo: then does:

        newSize := anInteger + (oldSize // 4 max: 20).
        grownCollection := collection class new: newSize.


aCollection is the argument collection, not the underlying collection.
So the new size is the required size plus ten, plus the larger of a
fourth of the current size and 20.
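
Plugging concrete numbers into that rule (a sketch only; the 4106 write
position is made up to be just past a 4096-element buffer):

        | oldSize newEnd newSize |
        oldSize := 4096. "current internal buffer size"
        newEnd := 4106. "position + aCollection size"
        newSize := newEnd + 10 + (oldSize // 4 max: 20).
        newSize "5140 -- about a 25% increase, not a doubling"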


>> imagine the following not very unlikely scenario. You start with a 4k
>> buffer and put on average a 10 element collection (remember all those
>> tags are put individually) until you have a 16k response. You allocate
>> more than a thousand intermediary collections to get there.
> 
> So just 3: one for 4k (the initial), one for 8k and one for 16k.

Yes, with doubling it's just three. The thousand figure came from my
misreading.

>> What would be better suited for Seaside is doubling the required size.
> 
> That's exactly what's happening in most cases.

In the #nextPut: case, not in the #nextPutAll: case.
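
For what it's worth, one way to sidestep the growth strategy altogether
is to pre-size the buffer. A sketch (the 16k figure is picked
arbitrarily to match the example below):

        | html |
        html := WriteStream on: (String new: 16384).
        "as long as fewer than 16384 characters are written,
        neither #pastEndPut: nor #growTo: is ever triggered"
        html nextPutAll: '<html><body>'.
        html nextPutAll: '</body></html>'.
        html contents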

>> In the worst case that would mean wasting 50% of the memory, but it
>> would make the number of intermediary allocations logarithmic in the
>> final size. In
> 
> The overhead is constant (amortized) in most cases.
> 
>> the given example that would take us only three allocations to get there.
>> Now I do realize there are other applications for Pharo where this
>> strategy is not ideal and this is not a killer for us. I just wanted to
>> shed some light on this and ask whether other projects are in a similar
>> situation.
>>
>> To get a feel for how allocation-limited Seaside is: avoiding one allocation
>> of a 16k ByteArray per request can make a difference in throughput
>> between 10 Mbyte/s and 30 Mbyte/s (see "[Progress Report] Zinc HTTP
>> Components"). If anybody knows a way to make allocation of large young
>> space objects faster (Set GC Bias to Grow?, #vmParameterAt:put:?) I'd
>> like to hear it.
> 
> IIRC CogVM's GC is different from SqueakVM's GC in the sense that it is
> activated when a predefined amount of data is allocated, not when the
> number of allocations since the last GC reaches a predefined limit. So
> you have to tune the GC differently on Cog and non-Cog VMs.

Do you know where I can find more information about this?

Cheers
Philippe

