On Tuesday, December 3, 2013 3:07:35 PM UTC-8, Nels Nelson wrote:
> On Tuesday, December 3, 2013 2:38:35 PM UTC-6, Jeremy Evans wrote:
>>
>> On Mon, Dec 2, 2013 at 9:00 PM, Nels Nelson wrote:
>>
>>> Thank you! I completely failed to find that in the docs. I'm sure it's
>>> there, and I just missed it. It is exactly ideal, and it works perfectly.
>>>
>>
>> Great!
>>
>
> Oops, I spoke too soon. So, the after_load method applies exclusively to
> the artist node's albums association after the album association loads, and
> is defined with a proc on the association definition. What I really need
> is something that runs after the artist node loads, and then iterates over
> the album association members for that node. Removing the
> after_initialize method and instead trying to use the after_load method
> on the album association resulted only in my custom code not getting
> invoked at all, resulting in the perceived performance improvement.
>
Why do you need this to run in after_initialize, instead of being
called lazily when the association is loaded? Note that if you are eager
loading the association with the object, after_load will operate as if it
was called by after_initialize. In other words:
Bar.many_to_many :foos, :after_load=>proc{...}
Bar.eager(:foos).all{
  # the :foos after_load proc has already been called once for each bar
}
> class Artist < Sequel::Model
> plugin :rcte_tree
> plugin :caching, GloballyDefinedConcurrentHashMapInstance
> plugin :tactical_eager_loading
> many_to_many :albums,
> :class => Album,
> :join_table => :artists_to_albums,
> :left_key => :artist_id,
> :right_key => :album_id
>
> def after_initialize
> for influenced in self.children
> for album in influenced.albums
> something_special_with(album)
> end
> end
> end
> end
>
> Many Artists may be associated with many Albums. Artists are nodes in a
> tree. Upon loading of an Artist node's children Artist nodes, I would like
> each of the children Artist nodes to have included each of their respective
> associated Albums so that they are available immediately, and without
> additional SQL queries.
>
You probably want to add this after plugin :rcte_tree:
one_to_many :children, :clone=>:children, :eager=>:albums
This makes sure that when you load the children association, the albums
association for the children is eagerly loaded.
I still can't detect a reason that you would want to do this during
after_initialize, unless something_special_with(album) mutates the
receiver. Is that the case?
>
> So when I do,
>
> jimi = Artist.where(:name => 'Jimi Hendrix')
> # Meanwhile, the after_initialize hook method is executed.
>
> (For the sake of this example, please ignore the fact that in the real
> world, artists may of course have more than one influencer.)
>
I should probably also ignore that Model.where returns a dataset, not a
model instance. :) I'm assuming you want to add a .first at the end there.
Assuming that's the case, this should be three queries:
1) Retrieve the first artist with the name Jimi Hendrix
2) Retrieve the children of that artist
3) Retrieve the albums for those children
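To make the query count concrete, here is a plain-Ruby sketch that does not use Sequel at all. FakeDB, its lookup methods, and the sample rows are all hypothetical stand-ins; the only point is the difference between one albums lookup per child and one batched lookup for all children, which is roughly what eager loading buys you.

```ruby
# Plain-Ruby illustration only; FakeDB and its data are hypothetical, not Sequel.
class FakeDB
  attr_reader :query_count

  ARTISTS = [
    { id: 1, name: 'Jimi Hendrix', parent_id: nil },
    { id: 2, name: 'Child A',      parent_id: 1 },
    { id: 3, name: 'Child B',      parent_id: 1 }
  ].freeze

  # Join-table rows flattened to [artist_id, album_title] pairs.
  ARTIST_ALBUMS = [[2, 'Album X'], [2, 'Album Y'], [3, 'Album Z']].freeze

  def initialize
    @query_count = 0
  end

  def artist_by_name(name)
    @query_count += 1
    ARTISTS.find { |a| a[:name] == name }
  end

  def children_of(artist_id)
    @query_count += 1
    ARTISTS.select { |a| a[:parent_id] == artist_id }
  end

  # One lookup for any number of artist ids, like WHERE artist_id IN (...).
  def albums_for(artist_ids)
    @query_count += 1
    ARTIST_ALBUMS.select { |artist_id, _| artist_ids.include?(artist_id) }
                 .group_by(&:first)
                 .transform_values { |pairs| pairs.map(&:last) }
  end
end

# Lazy: one albums lookup per child (1 + 1 + N lookups).
def load_lazily(db)
  jimi = db.artist_by_name('Jimi Hendrix')
  db.children_of(jimi[:id]).to_h do |child|
    [child[:name], db.albums_for([child[:id]]).fetch(child[:id], [])]
  end
end

# Eager: one albums lookup for all children together (always 3 lookups).
def load_eagerly(db)
  jimi = db.artist_by_name('Jimi Hendrix')
  children = db.children_of(jimi[:id])
  albums = db.albums_for(children.map { |c| c[:id] })
  children.to_h { |child| [child[:name], albums.fetch(child[:id], [])] }
end

lazy_db = FakeDB.new
eager_db = FakeDB.new
load_lazily(lazy_db)
load_eagerly(eager_db)
puts "lazy: #{lazy_db.query_count} lookups, eager: #{eager_db.query_count} lookups"
```

With the two children in the sample data this prints "lazy: 4 lookups, eager: 3 lookups". The lazy count grows with the number of children, while the eager count stays at three.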
> It would be nice if at most only two queries were executed here. Only one
> query would be even better, but I can never keep straight whether or not
> the children association can be loaded eagerly using rcte_tree -- let alone
> loaded eagerly while also eagerly loaded with their respective albums
> association members. I'm pretty sure the answer is no.
>
Technically, you could do it in two queries if the children association
uses :eager_graph=>:albums instead of :eager=>:albums (I think, I didn't
test that). I'm not sure it would perform better that way, though.
> Furthermore, it would be even nicer if after_initialize only ran one time
> ever, or until the model instances were marked as modified, or until the
> cached instances were explicitly cleared. Unfortunately, the only instance
> that winds up making it into the cache is the instance that gets assigned
> to jimi. And of course, none of jimi's albums make it into the cache
> either.
>
The caching plugin only populates the cache when you do a primary key
lookup, and only uses cache lookups by primary key, so based on the code
you've shown, I wouldn't use it. Roll your own caching:
def after_initialize
  GloballyDefinedConcurrentHashMapInstance[pk] ||= children.each{|artist|
    artist.albums.each(&method(:something_special_with))}
end
So this doesn't cache the object, but caches the presumably expensive
calculation. Good luck with cache invalidation :)
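If it helps, here is a plain-Ruby sketch of the same idea with the memoization and the manual invalidation spelled out. CalculationCache, expensive_summary, and the Struct-based Artist are hypothetical stand-ins rather than Sequel API, and a Mutex-guarded Hash stands in for whatever concurrent map you actually use.

```ruby
# Hypothetical sketch: memoize an expensive per-record calculation by primary key.
class CalculationCache
  def initialize
    @store = {}
    @lock = Mutex.new
  end

  # Returns the cached value for pk, computing it via the block on a miss.
  # key? is used instead of ||= so that nil or false results are cached too.
  def fetch(pk)
    @lock.synchronize do
      return @store[pk] if @store.key?(pk)

      @store[pk] = yield
    end
  end

  # Invalidation has to be done by hand whenever the underlying rows change.
  def invalidate(pk)
    @lock.synchronize { @store.delete(pk) }
  end
end

# Stand-in for a model instance; a real Sequel::Model is not needed here.
Artist = Struct.new(:pk, :children, :albums)

CALC_CACHE = CalculationCache.new

# Stand-in for the expensive work: collect album titles across the children.
def expensive_summary(artist)
  CALC_CACHE.fetch(artist.pk) do
    artist.children.flat_map(&:albums).map(&:upcase)
  end
end
```

Note that the block passed to fetch must not call fetch on the same cache, since Ruby's Mutex is not reentrant. You would also have to call invalidate yourself whenever an artist's children or albums change.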
> I hope this helps clear things up. I'm having trouble being as clear as
> I'd like to be.
>
> I did try using the suggested
>
> one_to_many :children, :clone=>:children, :eager=>:albums
>
>
> on the Artist model definition, but not only does it not seem to achieve
> the desired performance improvements, it also causes some weird errors when
> attempting to access a count of the cloned Artist.children association. I
> can investigate more about that, if need be.
>
If you post some details about that, it would be helpful. Sequel 4.4.0
fixed a couple bugs related to association cloning, but it's possible there
are more. You could always manually recreate the association in the
meantime:
one_to_many :children, :key=>:parent_id, :class=>self, :eager=>:albums
>
>
>>> ... a given node's branch is programmatically pre-order traversed, and
>>> information retrieved. I may have to rework this, because it is a bit of a
>>> performance pain point for the application.
>>>
>>
>> That sounds interesting. If you want to provide details, I may be able
>> to give advice in that area.
>>
>
> The algorithm is an adaptation from another program, and it is rather
> convoluted. Basically, it performs a pre-order traversal of a branch
> starting from a given node, and then does some complex book-keeping along
> the way, foregoing recursive pre-order traversals of certain nodes meeting
> specific criteria.
>
> The reason this routine is so slow is because some hundreds of queries and
> after_initialize hooks get executed against the datasource backend, even if
> there are only a couple dozen nodes in the branch.
>
I'm assuming that's due to things not being eagerly loaded currently.
> I'm pretty certain that if the loading and caching problem with the
> example above is solved, then this routine's performance will improve as
> well. Furthermore, if I were able to leverage the descendants dataset of
> the given branch node, and somehow managed to devise a filter based on the
> algorithm's criteria, then I imagine this bottleneck would be solved
> completely.
>
There's always:
jimi.descendants_dataset.where(...).all
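If the criteria can't easily be expressed as a SQL filter, another option is to load the branch once and do the pruned pre-order traversal in memory. Below is a plain-Ruby sketch under some assumptions: the rows are already loaded as hashes with :id and :parent_id keys, the root row is included in the list, and preorder_with_pruning and the skip predicate are hypothetical names, not Sequel API.

```ruby
# Hypothetical sketch: pruned pre-order traversal over rows already in memory.
# rows: array of hashes with :id and :parent_id keys (root row included).
# skip: callable returning true for nodes whose subtree should not be descended.
def preorder_with_pruning(rows, root_id, skip:)
  children_of = rows.group_by { |row| row[:parent_id] }
  by_id = rows.to_h { |row| [row[:id], row] }
  visited = []
  stack = [by_id.fetch(root_id)]
  until stack.empty?
    node = stack.pop
    visited << node[:id]
    # A skipped node is still visited, but its children are never pushed.
    next if skip.call(node)

    # Push children in reverse so the first child is popped (visited) first.
    children_of.fetch(node[:id], []).reverse_each { |child| stack.push(child) }
  end
  visited
end

rows = [
  { id: 1, parent_id: nil, name: 'root' },
  { id: 2, parent_id: 1,   name: 'a' },
  { id: 3, parent_id: 2,   name: 'a1' },
  { id: 4, parent_id: 1,   name: 'b' },
  { id: 5, parent_id: 4,   name: 'b1' }
]
p preorder_with_pruning(rows, 1, skip: ->(node) { node[:name] == 'b' })
```

For the sample rows this prints [1, 2, 3, 4]: node 4 is visited, but its child 5 is skipped. The traversal itself issues no queries, so the cost is whatever it takes to load the rows up front.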
>>> However, the associations with which I am concerned are not the children
>>> associations. They are just the normal many-to-many associations on the
>>> individual nodes.
>>>
>>
>> OK. I think the standard association :eager option or the
>> tactical_eager_loading plugin should help in those cases, modulo caching
>> issues.
>>
>
> I am having difficulty finding an example of how to apply the :eager option
> to a many_to_many association definition. Does this involve defining and
> specifying a proc? Or is it as simple as
>
> many_to_many :albums, :eager => true
>
It would be something like:
many_to_many :albums, :eager=>:some_Album_association
With this, loading the albums for a given artist would additionally
preload that association for each of those albums (using a single query).
> Thanks again for your time, Jeremy. I hope that you are not the only one
> in this forum who ever replies to questions? :D
>
For most of the hard questions, I am. :) Many easy questions are answered
by the documentation, so I do tend to answer most of the questions myself,
though thankfully other contributors help out, especially when the question
is more ecosystem related than Sequel related.
Thanks,
Jeremy