[google-appengine] Re: list-property, many to many, 5000 indexes...HELP!

stefoid Wed, 07 Jul 2010 22:21:21 -0700

You're right, it does sound complex.  I'm not really sure what you're
talking about, but I have yet to view the I/O pipeline video, so I'll do
that first.




On Jul 8, 11:22 am, Martin Webb <[email protected]> wrote:
> The way to do it is to think of books and friends as streams.
>
> That way they are both the same model, and you only need one model/API to
> handle both.
>
> A book I have read is a stream,
> and a friend is a stream.
>
> All you now do is subscribe users to friend streams, and users to book
> streams or book tag streams. You can also map streams to streams, which
> allows tags to include other tags, if you really want to get nifty.
>
> So the tag Horror is a stream, and any number of users can subscribe to
> it. The stream can have a path such as stream/books/horror, which returns
> the activity of that stream. The same goes for a user: use a path as the
> stream name, e.g. stream/user/martin, which would then be my activity
> stream.
>
> This is basically done using the model that you have been given so far.
>
> from google.appengine.ext import db
>
> # parent
> class stream(db.Model):
>     # key_name = path
>     Count = db.IntegerProperty(default=0)  # how many lists we have used - n
>
> # child
> class stream_index(db.Model):
>     # Don't store your user name here; store a stream path,
>     # like stream/user/martin.
>     subscribers = db.StringListProperty()
>
> All you now do is create your stream and append your subscribers to it in
> the list property.
>
> BUT this brings a whole new contention issue. The solution above allows
> streams to have millions of subscribers. If they all subscribe or
> unsubscribe at once, you may have update issues on the same list property,
> which is the same reason we use sharded counters for counts. To work around
> this you use a pipeline. Google I/O 2010 had a whole 1.5hr talk on this, and
> it's good stuff. If you watch it a few times it's simple to grasp, and
> Brett's solution can be used for this proposal.
>
> Basically, rather than appending your users to your stream as you would
> normally:
>
> subscribers.append(user.name)
>
> you append them to a work line: a pipeline of work that is picked off by a
> task queue and then batched for updating in your streams. The work is then
> removed from the work line.
>
> class work_line(db.Model):
>     # The work index is a key that identifies the task that is fired
>     # and which stream that task is for.
>     work_index = db.StringProperty()
>     subscriber = db.StringProperty()  # +martin or -martin
>
> The above is a simple work line that just stores the user name to be
> subscribed (+) or unsubscribed (-).
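
As a rough illustration of the work line described above, here is a minimal pure-Python sketch of the batching step. It does not use the datastore or the task queue at all: plain dicts and lists stand in for the stream_index and work_line entities, and the function name apply_work_line and the sample paths are assumptions made up for this example, not part of Martin's design.

```python
from collections import defaultdict


def apply_work_line(subscribers_by_stream, work_items):
    """Fold queued "+name" / "-name" work items into each stream's list.

    subscribers_by_stream: dict of stream path -> list of subscriber paths
        (stands in for the stream_index entities).
    work_items: iterable of (stream_path, subscriber) pairs, where
        subscriber looks like "+stream/user/martin" or "-stream/user/martin"
        (stands in for the work_line entities).
    Returns the number of work items consumed.
    """
    # Group the queued work by stream so each stream is updated once per batch.
    batches = defaultdict(list)
    for stream_path, subscriber in work_items:
        batches[stream_path].append(subscriber)

    consumed = 0
    for stream_path, changes in batches.items():
        current = subscribers_by_stream.setdefault(stream_path, [])
        for change in changes:
            op, name = change[0], change[1:]
            if op == "+" and name not in current:
                current.append(name)
            elif op == "-" and name in current:
                current.remove(name)
            consumed += 1
    return consumed


# Hypothetical sample data.
streams = {"stream/books/horror": ["stream/user/anna"]}
queue = [
    ("stream/books/horror", "+stream/user/martin"),
    ("stream/books/horror", "-stream/user/anna"),
    ("stream/user/martin", "+stream/user/anna"),
]
apply_work_line(streams, queue)
```

In a real app the grouping would presumably happen inside a task, with one datastore write per stream per batch, which is what reduces contention on a popular stream.
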
>
> This all sounds very complex, and possibly it is. The good news is that I
> am building a stream subscription service using a pipeline.
>
> View the latest I/O 2010 pipelines talk and the 2009 building scalable web
> apps talk; the links are posted all over this group.
>
> Regards
>
> Martin Webb
>
> ________________________________
> From: Jeff Schwartz <[email protected]>
> To: [email protected]
> Sent: Wed, 7 July, 2010 18:56:10
> Subject: Re: [google-appengine] list-property, many to many, 5000 indexes...HELP!
>
> I wouldn't put the list attribute in your book class, due to serialization
> issues and exploding indexes. I'd put the list in a child entity of the
> book and query the child kind to return a list of entity keys on a match.
> As child keys contain parent keys as well, I'd use the list of keys
> returned to grab the actual book entities. You can refine this as much as
> you like.
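
To make Jeff's suggestion concrete, here is a small pure-Python sketch of the idea: the tag list lives on a child of each book, the query returns only child keys, and each parent book key is recovered from the child key. Keys are modelled as plain tuples, and the kind name BookTagIndex and the helper names are hypothetical. With the real db API this would presumably be a keys-only query followed by Key.parent() and a batch db.get(), but that part is not shown here.

```python
def parent_key(child_key):
    """A child key path is the parent's path plus (kind, name); drop that pair."""
    return child_key[:-2]


def books_with_tags(index_entities, book_entities, wanted_tags):
    """Return the books whose child index entity carries all wanted tags."""
    wanted = set(wanted_tags)
    # Step 1: "keys-only" match against the small index entities.
    matching_keys = [
        key for key, tags in index_entities.items() if wanted <= set(tags)
    ]
    # Step 2: derive parent keys and fetch the actual book entities.
    return [book_entities[parent_key(key)] for key in matching_keys]


# Hypothetical sample data: two books, each with one child index entity.
books = {
    ("Book", "dune"): {"title": "Dune"},
    ("Book", "emma"): {"title": "Emma"},
}
tag_index = {
    ("Book", "dune", "BookTagIndex", "tags"): ["sci-fi", "adventure"],
    ("Book", "emma", "BookTagIndex", "tags"): ["romance", "drama"],
}
sci_fi_books = books_with_tags(tag_index, books, ["sci-fi"])
```

The point of the split is that the large list is never loaded or deserialized when the book itself is fetched.
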
>
> Jeff
>
> On Wed, Jul 7, 2010 at 1:58 AM, stefoid <[email protected]> wrote:
>
> >Hi I think I have a fair handle on how Datastore works, but I need to
> >check, and I need some help with a design
>
> >lets say I have:
> >a very large list of BOOKS
> >the BOOKS are tagged- "adventure+romance",   "historical+drama",
> >"animation+sci-fi+comedy", etc...
> >and I have a very large list of USERS.
> >USERS can read a very large number of BOOKS
> >And I, as a USER, can have a very large number of USERS who are my
> >friend.
>
> >What I want, in English, is:
> >Return a list of all the BOOKS tagged with "sci-fi" and "romance"
> >that have been read by USERS who are my friends, sorted by Most
> >Frequently Read.
>
> >Now, I know I can model tags with a ListProperty, so that filtering is
> >easy = "WHERE tag AND tag AND tag..."
> >And I know that I can sort Books by Most Frequently Read  quite easily
>
> >But due to the large number of BOOKS that could be read by a USER,
> >and the large number of USERS who could be my friends, I can't
> >practically model BOOKS READ BY USERS and USERS WHO ARE MY FRIEND as
> >ListProperties...
>
> >...because I would blow the 5000 index limit per entity, because each
> >value of a ListProperty generates its own index entry.  Is that correct?
>
> >I can't find a way to handle this without resorting to filtering in
> >memory.  Am I dumb, or is that just the way it is?
>
> >My solution is:
>
> >BOOK:
> >numberOfReads
> >TagList
> >otherStuff
>
> >Then I look up a relationship entity to find out everyone who is a
> >FRIEND of mine.
> >FRIENDS:
> >userKey
> >friendKey
>
> >Then I look up a relationship entity to find out which BOOKS that
> >USERS have read
> >BOOKSREAD:
> >userKey
> >bookKey
> >TagList
> >numberOfReads
>
> >So first I pull into memory the list of my particular friendKeys from
> >the FRIENDS table.
>
> >Then I churn through the table of BOOKSREAD, using the duplicated
> >properties TagList and numberOfReads to filter on tags and sort
> >according to number of reads.
>
> >I then filter this list of BOOKS *in memory* by the friendKeys I
> >have.  I'll need some sort of algorithm to keep pulling chunks of BOOKS
> >from the database until I have a full page of books my friends have
> >read (20 books displayed per page) to pass to the GUI.
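
The "keep pulling chunks until the page is full" step could look roughly like the sketch below. It is pure Python with no datastore calls: books_read stands in for the BOOKSREAD rows, assumed to be already filtered by tag and sorted by numberOfReads, and the function name and the chunk_size parameter are assumptions for illustration only.

```python
from itertools import islice


def friends_books_page(books_read, friend_keys, page_size=20, chunk_size=100):
    """Collect up to page_size distinct books read by any of friend_keys.

    books_read: iterable of (user_key, book_key) rows, assumed to arrive
        already tag-filtered and sorted by number of reads (descending).
    friend_keys: set of user keys pulled from the FRIENDS relationship.
    """
    page = []
    seen = set()
    rows = iter(books_read)
    while len(page) < page_size:
        # Pull the next chunk; an empty chunk means the rows are exhausted.
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            break
        for user_key, book_key in chunk:
            # In-memory filter: keep rows from friends, skip repeated books.
            if user_key in friend_keys and book_key not in seen:
                seen.add(book_key)
                page.append(book_key)
                if len(page) == page_size:
                    break
    return page
```

For example, calling friends_books_page(rows, {"u1", "u3"}, page_size=2, chunk_size=2) stops reading rows as soon as two distinct books from those two friends have been found. The cost is unbounded when few of the top rows belong to friends, which is the weakness of filtering in memory.
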
>
> >????
>
> >--
> >You received this message because you are subscribed to the Google Groups
> >"Google App Engine" group.
> >To post to this group, send email to [email protected].
> >To unsubscribe from this group, send email to
> >[email protected].
> >For more options, visit this group at
> >http://groups.google.com/group/google-appengine?hl=en.
>
> --
> --
> Jeff
>

