[perl #111848] [BUG] The storage strategy for arrays is weird in Rakudo

2012-03-19 Thread Carl Mäsak
# New Ticket Created by  Carl Mäsak 
# Please include the string:  [perl #111848]
# in the subject line of all future correspondence about this issue. 
# URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=111848 


masak r: my $t = now; my @a; @a[ $_ ] = $_ for 1 .. 1e4; print now - $t
p6eval rakudo 1968b8: OUTPUT«2.55866416794898»
masak r: my $t = now; my %h; %h{ $_ } = $_ for 1 .. 1e4; print now - $t
p6eval rakudo 1968b8: OUTPUT«1.51738952909994»
masak in Rakudo, hashes are faster than arrays :)
masak in re http://perlmonks.org/?node_id=960334
moritz masak: yes, it pays the price of laziness
moritz (it = Arrays)
masak moritz: even when, as here, I'm not using that laziness at all.
moritz masak: correct
masak what is the additional cost here, more exactly?
moritz well, for each array access we need to check if the number of
reified elements is big enough
moritz though that should be rather cheap-ish
moritz I kind think we're doing something very wrong when it comes
to arrays and lists
moritz nom: my @a = 1, 2, 3; say @a.DUMP
p6eval rakudo 1968b8: OUTPUT«Array-414978996074906(:items(Mu),
:nextiter(ListIter-414978996073111(:reified(▶Mu),
:rest(RPA-414978996040334(Array-414978996076959(:items(RPA-414978996035975(▶1,
▶2, ▶3)), :nextiter(▶Mu, :list(Array-414978996074906␤»…
moritz there you see what's wrong
moritz it's not the outer-most array that holds all the reified elements
masak right.
moritz they are stored in two layers of nextiter
masak I almost feel tempted to submit a rakudobug about that.
moritz please do
* masak submits rakudobug

Submitting bug reports about suboptimality is risky business. So
here's something concrete and actionable: remove the double nextiter
cruft from the above .DUMP output. Bonus points -- but not required to
close this ticket -- if you can get the benchmark above to run faster
too. Ideally, arrays should be faster than hashes.


[perl #111848] [BUG] The storage strategy for arrays is weird in Rakudo

2012-03-19 Thread Patrick R. Michaud via RT
The speed of Array vs. Hash element access is partially addressed by 
commit c10792f8.  Before this commit, each Array element access via 
.at_pos($pos) resulted in an expensive call to self.exists($pos) for 
each access, whereas Hash element accesses via .at_key($key) were 
able to use nqp::existskey() directly.  This patch uses nqp::existspos() 
to hotpath the detection of existing elements and avoids calls to 
self.exists($pos) when the Array is already fully reified.  

For the benchmark given in RT #111848, this resulted in a ~25%
speedup for array element accesses, and brings it to within 5% of Hash
element access times.  (At present Array element accesses still have
more overhead at the PIR level than Hash element accesses due to
laziness considerations and boundary checks.

I'm still looking into the two layers of nextiter part; I agree
it's somewhat surprising but I'm not convinced it's suboptimal
within the current context of how we handle/store constant lists.
For the array assignment Rakudo has to do iteration at some point in
order to bring the elements into the top-level items part; it's
simply choosing to do it at the time of Array element access instead
of at the point of the assignment.  

Given that the ticket refers to suboptimality, it's hard to know
when it should be closed.  If masak++ or someone else feels that
this reply adequately addresses the issues for now, I'm fine with
it being resolved now, or if we want to leave it open pending further
discussion/resolution of the two nextiter layer issue, I'm fine
with that also.

Pm