[appengine-java] Joins Concept in GQL Using Java
Does anybody know how to implement joins in GQL? -- You received this message because you are subscribed to the Google Groups Google App Engine for Java group. To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine-java/-/qDWjp365DSUJ. To post to this group, send email to google-appengine-java@googlegroups.com. To unsubscribe from this group, send email to google-appengine-java+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/google-appengine-java?hl=en.
Re: [appengine-java] Joins!
Thanks all. This is a lot of great information. I've learned a ton.
Re: [appengine-java] Joins!
Why isn't denormalization a real option? A lot of companies denormalize with great success, including Google. The thing about joins is this: they have to happen in memory at some point, either in the datastore or in your local instance. -- Ikai Lan Developer Programs Engineer, Google App Engine plus.ikailan.com | twitter.com/ikai
On Thu, Aug 4, 2011 at 6:00 PM, William Levesque billleves...@gmail.com wrote: Alright, so I've spent a lot of time contemplating this whole "BigTable isn't relational" limitation. I've tried two techniques for joining different tables: the solution described at http://gae-java-persistence.blogspot.com/2010/03/executing-simple-joins-across-owned.html?showComment=1298589845909#c7562859098617623831 and joining with loops inside my code. The former eats a lot of CPU, the latter is just silly. Denormalizing isn't a real option. There are very good reasons normalization was developed. So I'm trying to get a definitive strategy from Google that is considered the best way to support a system with complex data relationships. I appreciate your help.
Re: [appengine-java] Joins!
Because if you have denormalized data, then record updates can become enormous. If someone's address is denormalized into 1000 contact records, then when the user updates their address the system has to go out to all of those contact records and update them as well. This gets multiplied by every complex relationship that exists in the data, and the redundant fields can multiply the data size. Regardless, denormalization is only one option. It just seems that Google should publish guidelines for managing complex data relationships, with clear guidance on the advantages and disadvantages of each strategy. It's an important architectural consideration, and we are currently left to hunt and peck around for what is even available, let alone what is best practice for a given set of system requirements.
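The fan-out described above can be made concrete with a minimal sketch. It is not App Engine code: `AddressFanOut`, the map-based records, and the field names `ownerId` and `address` are hypothetical in-memory stand-ins, and the only thing it shows is how many writes one address change costs in each layout.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the address example; not the datastore API. */
class AddressFanOut {

    /** Builds n contact records that each carry their own copy of the owner's address. */
    static List<Map<String, String>> makeContactRecords(String ownerId, int n) {
        List<Map<String, String>> records = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Map<String, String> record = new HashMap<>();
            record.put("ownerId", ownerId);
            record.put("address", "1 Old Street");
            records.add(record);
        }
        return records;
    }

    /** Denormalized layout: every copy must be rewritten. Returns the number of writes. */
    static int updateDenormalized(List<Map<String, String>> contactRecords,
                                  String ownerId, String newAddress) {
        int writes = 0;
        for (Map<String, String> record : contactRecords) {
            if (ownerId.equals(record.get("ownerId"))) {
                record.put("address", newAddress);
                writes++;
            }
        }
        return writes;
    }

    /** Normalized layout: the address is stored once, keyed by owner. Returns the number of writes. */
    static int updateNormalized(Map<String, String> addressByOwner,
                                String ownerId, String newAddress) {
        addressByOwner.put(ownerId, newAddress);
        return 1;
    }
}
```

With 1000 contact records the denormalized update performs 1000 writes and the normalized one performs a single write; the price of the normalized layout is paid on the read side instead, where each contact lookup needs a second fetch for the address.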
Re: [appengine-java] Joins!
William, could you explain how the update can be enormous with a denormalized table? My understanding is that a flat table is easier to update than a normalized one. Thanks.
Re: [appengine-java] Joins!
I was trying to explain that with: "If someone's address is denormalized into 1000 contact records, then when the user updates their address the system has to go out to all of those contact records and update them as well. This gets multiplied by every complex relationship that exists in the data, and the redundant fields can multiply the data size." But is Google's position that all data should be denormalized?
Re: [appengine-java] Joins!
William, you might want to go over this http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/mapreduce-osdi04.pdf, and come back again with any questions. Ikai and possibly others were trying to convey to you that the Bigtable approach is more scalable than the relational approach. If it works for Google, why wouldn't it work for you? Do you have more data than Google?
Re: [appengine-java] Joins!
I didn't mean to suggest that. Yes, a fan-out is potentially bad, but the problem with the normalized approach is that you optimize equally for reads and writes. In the address book example, I update my address book about 1 time every 3 years. I read my address book 20 times a day. I think it's fair to pay the cost of a fan-out on change because a change is so infrequent. Normalization has its benefits: it's harder for things to get out of sync, which is ALWAYS a risk with denormalization. A denormalized solution tends to favor eventually consistent approaches over strongly consistent approaches. My point is that every app can be built in a denormalized approach, and in the majority of cases, you actually *want* to build your app in this approach, not the other way around. -- Ikai Lan Developer Programs Engineer, Google App Engine plus.ikailan.com | twitter.com/ikai
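The read/write trade-off in the address book example can be put into rough numbers. The frequencies (20 reads a day, 1 update every 3 years) and the fan-out of 1000 come from the thread; the per-operation costs are assumptions made only for illustration: a denormalized read is counted as 1 fetch, a normalized read as 2 fetches (record plus joined address), and a normalized update as 1 write. `ReadWriteCost` is a hypothetical helper, not a datastore API.

```java
/** Back-of-the-envelope operation count; all unit costs are illustrative assumptions. */
class ReadWriteCost {

    /** Total operations = reads * cost per read + updates * cost per update. */
    static long totalOps(long reads, long opsPerRead, long updates, long opsPerUpdate) {
        return reads * opsPerRead + updates * opsPerUpdate;
    }

    public static void main(String[] args) {
        long reads = 20L * 365 * 3;   // 20 reads a day over 3 years = 21900
        long updates = 1;             // 1 address change in the same period

        // Denormalized: 1 fetch per read, 1000 writes per update (fan-out).
        long denormalized = totalOps(reads, 1, updates, 1000);
        // Normalized: 2 fetches per read (assumed join cost), 1 write per update.
        long normalized = totalOps(reads, 2, updates, 1);

        System.out.println("denormalized ops: " + denormalized);
        System.out.println("normalized ops:   " + normalized);
    }
}
```

Under these assumptions the denormalized layout costs 22900 operations over three years and the normalized one 43801. The picture reverses for data that is updated far more often than it is read, which is why the access pattern has to drive the choice.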
Re: [appengine-java] Joins!
As far as managing complex data relationships, I don't think such a set of practices exists. What I can and should do (once I get some time) is publish some case studies about how we have persisted data in some cases. True, denormalizing data often requires you to think a little bit, but that's also why I like it: the school of normalization can lead people to take a checklist approach to designing persistence schemas, which often produces substandard structures. Here's an example of a recent internal app I built/am building: a trip planner. For each user, I store every trip that user takes in a serialized structure on that user. For each region, I store a serialized list of trips to that region. Whenever someone updates a trip, I have to update several structures, but the assumption is that the app will be read-heavy rather than update-heavy. The application is heavily denormalized and uses get-by-key as much as possible. This app could also very easily have been built using a normalized approach: the trips table joins to regions, which joins to cities; users join to user_trips, which joins to trips. The way to approach denormalization is to think about what you are trying to achieve and work backwards from there to figure out how to save the data. -- Ikai Lan Developer Programs Engineer, Google App Engine plus.ikailan.com | twitter.com/ikai
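The trip planner layout described above can be sketched as follows. This is only an illustration of the idea, not Ikai's actual code: `TripStore` and its two maps are hypothetical in-memory stand-ins for "a serialized structure on the user" and "a serialized list on the region", and trips are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory sketch of a denormalized trip store; not the datastore API. */
class TripStore {
    // Each map stands in for one entity kind that embeds the full trip list.
    private final Map<String, List<String>> tripsByUser = new HashMap<>();
    private final Map<String, List<String>> tripsByRegion = new HashMap<>();

    /**
     * One logical write fans out to every structure that embeds the trip.
     * Returns the number of structures written.
     */
    int saveTrip(String userId, String region, String trip) {
        tripsByUser.computeIfAbsent(userId, k -> new ArrayList<>()).add(trip);
        tripsByRegion.computeIfAbsent(region, k -> new ArrayList<>()).add(trip);
        return 2;
    }

    /** Read path: a single get-by-key, no query and no join. */
    List<String> tripsForUser(String userId) {
        return tripsByUser.getOrDefault(userId, Collections.emptyList());
    }

    /** Read path for the other access pattern, served by the second copy. */
    List<String> tripsForRegion(String region) {
        return tripsByRegion.getOrDefault(region, Collections.emptyList());
    }
}
```

The design choice is visible in the method shapes: each read is one lookup against a structure that was prepared for exactly that question, and the cost is moved to `saveTrip`, which has to touch every copy.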
Re: [appengine-java] Joins!
I am not sure what you meant by fan-out and fan-in, but I agree with you that in the relational world data are more consistent, since they are stored and enforced by constraints etc. The denormalized form, however, does not require joins, which makes it more scalable because there is less overhead. If one high-level entity exists in multiple groups, yes, it is a waste of (Google's) storage space and more updates are needed, but hasn't MapReduce shown that this is still less intensive than table joins?
Re: [appengine-java] Joins!
On Fri, Aug 5, 2011 at 11:45 AM, William Levesque billleves...@gmail.com wrote: But is Google's position that all data should be denormalized? I don't think anyone would say that. I wrote up my thoughts around this subject here: http://blog.similarity.com/post/7541938593/how-to-build-an-online-dating-site-nosql-edition The upshot is that we've been conditioned by SQL theorists to believe that there is a proper way of modeling data; that the data itself defines the schema and the magic of the RDBMS behind the curtain makes it fast. Unfortunately, this is a lie. It worked to a point but the traffic demands of a mass consumer application have vastly outstripped the RDBMS. You're back to figuring out how to optimize your schema for your particular query profile. So the answer is not denormalize everything, it's denormalize the right things. And the right things will vary from application to application. You just have to build up a correct mental model of how the datastore performs and then design your application accordingly. Jeff
[appengine-java] Joins and Persistence - How?
Greetings,
I am trying to figure out a best practice for avoiding the use of joins. So far it isn't going too well, so here I am asking if anyone is up for sharing their experience or linking to the reference I failed to find. :)
An example of my requirement is this. We have two classes:
Person
---
+ listData
+ listPerson
Data
+ person
The domain model is not set in stone; it merely illustrates that a Person somehow has a list of friends in his listPerson attribute and a list of his own data in listData. Now, getting the Data for a single Person is trivial, but getting the Data for a Person's friends is not. I cannot get around somehow having to:
1) Fetch the Person (1 query)
2) Fetch every Person in listPerson (n queries, one per friend)
3) Fetch all Data from the list built in 2) by combining all the friends' listData fields (1 query)
The problem is the n queries in step 2, which I believe would be a killer and eventually a dead end when people have more than 30 friends. Any hints? :)
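One possible way around the n queries in step 2 is to store keys rather than objects in listPerson and listData and to fetch each list with a single batch get. The low-level datastore API offers a batch get by key, `DatastoreService.get(Iterable<Key>)`; the sketch below does not call it and instead uses a hypothetical in-memory `FakeStore` that counts round trips, so the names `FakeStore`, `PersonRecord`, `friendIds`, and `dataIds` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical entity: holds keys of friends and of the person's own data. */
class PersonRecord {
    final List<String> friendIds;
    final List<String> dataIds;

    PersonRecord(List<String> friendIds, List<String> dataIds) {
        this.friendIds = friendIds;
        this.dataIds = dataIds;
    }
}

/** Hypothetical in-memory store that counts round trips; not the datastore API. */
class FakeStore {
    final Map<String, PersonRecord> people = new HashMap<>();
    final Map<String, String> data = new HashMap<>();
    int roundTrips = 0;

    PersonRecord getPerson(String id) {
        roundTrips++;
        return people.get(id);
    }

    /** Batch get: any number of keys, one round trip. */
    Map<String, PersonRecord> getPeople(List<String> ids) {
        roundTrips++;
        Map<String, PersonRecord> found = new LinkedHashMap<>();
        for (String id : ids) {
            if (people.containsKey(id)) found.put(id, people.get(id));
        }
        return found;
    }

    /** Batch get for data entities, again one round trip. */
    Map<String, String> getData(List<String> ids) {
        roundTrips++;
        Map<String, String> found = new LinkedHashMap<>();
        for (String id : ids) {
            if (data.containsKey(id)) found.put(id, data.get(id));
        }
        return found;
    }
}

class FriendData {
    /** Returns the data of all friends of personId using three round trips in total. */
    static List<String> dataOfFriends(FakeStore store, String personId) {
        PersonRecord person = store.getPerson(personId);                          // step 1
        Map<String, PersonRecord> friends = store.getPeople(person.friendIds);   // step 2, batched
        List<String> dataIds = new ArrayList<>();
        for (PersonRecord friend : friends.values()) {
            dataIds.addAll(friend.dataIds);
        }
        return new ArrayList<>(store.getData(dataIds).values());                 // step 3, batched
    }
}
```

In this sketch the number of round trips stays at three whether the person has 3 friends or 300. A further step, in the spirit of the denormalization discussed earlier in the group, would be to copy the friends' data keys onto the Person itself so that step 2 disappears, at the cost of extra writes whenever a friend adds data.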