Another consideration with links is that they are returned in the HTTP 
headers. If an object has too many links, you can blow past the maximum size 
of an HTTP header, and bad things will happen.
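
To get a feel for how quickly that adds up, here is a rough Python sketch that 
estimates the size of the Link header for a given set of links. The header 
format is my understanding of what the HTTP interface emits, the helper names 
are made up for illustration, and the 8192 byte limit is purely an assumed 
number; check what your proxy, load balancer, and client actually allow.

```python
# Rough estimate of how large a Link header gets as links pile up.
# ASSUMED_HEADER_LIMIT is a placeholder, not a documented Riak value.
ASSUMED_HEADER_LIMIT = 8192


def link_header_value(links):
    """Render (bucket, key, tag) triples as one Link header value.

    No URL-encoding is applied, so this assumes plain ASCII names.
    """
    return ", ".join(
        '</buckets/{}/keys/{}>; riaktag="{}"'.format(bucket, key, tag)
        for bucket, key, tag in links
    )


def link_header_bytes(links):
    """Size in bytes of the full 'Link: ...' header line (without CRLF)."""
    return len(("Link: " + link_header_value(links)).encode("utf-8"))


def links_fit(links, limit=ASSUMED_HEADER_LIMIT):
    """True if the rendered header stays within the assumed limit."""
    return link_header_bytes(links) <= limit


if __name__ == "__main__":
    # Hypothetical object linking to 500 streets.
    links = [("streets", "street-{:05d}".format(i), "contains") for i in range(500)]
    print(link_header_bytes(links), links_fit(links))
```

Each link costs roughly the length of its bucket, key, and tag plus a fixed 
overhead, so a few hundred links on one object is already well past a limit of 
that size.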

There's MapReduce (MR) functionality to link walk[1], but link walking may or may not be 
quick depending on your client library, links, phase of the moon, etc. Maybe 
one of the Ripple devs can comment on how things work over there.

What's the big picture problem that you're trying to solve? Are you trying to 
determine the fastest way to traverse a rigid hierarchy in your data?

The sample MR query you provided looks like you're effectively trying to 
accomplish something like:

SELECT d.*
FROM countries co
JOIN states s ON co.country_id = s.country_id
JOIN cities ci ON s.state_id = ci.state_id
JOIN streets st ON ci.city_id = st.city_id
JOIN devices d ON st.street_id = d.street_id
WHERE co.name = 'USA';

If that's the case, why not store the intermediate data as secondary indexes on 
the device itself? Then you can simply run a query to determine which devices 
are in the US rather than walk across multiple buckets. With sufficient 
secondary indexes at your intermediate levels, you should be able to easily 
recompute your various roll ups for reporting as the underlying data changes 
and still get quick reporting without having to traverse the existing buckets. 
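
As a rough sketch of what I mean, assuming the HTTP interface and its 
x-riak-index-* headers: the snippet below only builds the index headers you 
would send when storing a device and the path for an exact-match index query. 
It doesn't talk to a cluster, and the index names (country_bin, state_bin, 
and so on), the "devices" bucket, and the helper functions are invented for 
illustration.

```python
from urllib.parse import quote


def device_index_headers(country, state, city, street):
    """Headers to send when storing a device so that every level of the
    hierarchy is indexed on the device object itself.

    The index names are hypothetical; the _bin suffix marks string indexes.
    """
    return {
        "x-riak-index-country_bin": country,
        "x-riak-index-state_bin": state,
        "x-riak-index-city_bin": city,
        "x-riak-index-street_bin": street,
    }


def index_query_path(bucket, index, value):
    """Path for an exact-match secondary index query over HTTP."""
    return "/buckets/{}/index/{}/{}".format(
        quote(bucket, safe=""), index, quote(value, safe="")
    )


if __name__ == "__main__":
    # Hypothetical device in the hypothetical "devices" bucket.
    print(device_index_headers("USA", "OR", "Portland", "SE Division St"))
    # One query against the devices bucket instead of walking
    # countries -> states -> cities -> streets -> devices.
    print(index_query_path("devices", "country_bin", "USA"))
```

The trade-off is that every device carries a copy of its ancestors, so if a 
street moves to a different city you have to rewrite the devices on it.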

[1]: http://basho.com/blog/technical/2010/02/24/link-walking-by-example/
--- 
Jeremiah Peschka - Managing Director, Brent Ozar PLF, LLC
Microsoft SQL Server MVP

On Sep 21, 2012, at 2:46 AM, Christian Dahlqvist <[email protected]> 
wrote:

> Hello Timo,
> 
> I recently played around with using secondary indexes instead of links, and 
> since I did not find any mapreduce functions that allowed me to follow 2i 
> "links", I wrote a couple myself. These are available on my GitHub account 
> and I also submitted them to Basho Contrib yesterday for review.
> 
> Using secondary indexes to represent relationships will require the mapreduce 
> function to execute an index query, which will most likely be slower than 
> link walking, although it will as you stated be easier to maintain due to the 
> reduced number of records that need to be modified. As you seem to have a 
> quite deep hierarchy, maybe a mixture of links (for records and relationships 
> not changing often) and 2i "links" may work?
> 
> Best regards,
> 
> Christian 
> 
> 
> 
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
