Hey Jon,
  Emails, usernames, and IP addresses (typically) look pretty different,
so mixing them is probably not a big deal.  That is why I suggested not
creating separate models for emails, IPs, and usernames: it reduces the
number of indexes you might need in the future.  For example, if you were
to have an 'active' property on the page that you needed to include while
searching pages, you could include that 'active' property on your
PageIndex model too.  Then you would only need one composite index
instead of three.  This may not matter for your application.  It also
lets you (depending on list sizes) search on, for instance, an email
and an IP together, and GAE's internal merge-join will automatically be
used.
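
  A minimal sketch of the single-list idea, in plain Python.  It does not
touch the datastore: build_searchlist and matches_all are hypothetical
helper names, and matches_all only imitates what two equality filters on
the same list property would select.

```python
def build_searchlist(emails, ips, usernames):
    """Merge the three kinds of values into one de-duplicated list.

    The result is what would be stored in a single list property
    (for example the 'searchlist' field discussed in this thread).
    """
    seen = set()
    merged = []
    for value in list(emails) + list(ips) + list(usernames):
        if value not in seen:
            seen.add(value)
            merged.append(value)
    return merged


def matches_all(searchlist, *wanted):
    """In-memory stand-in for several equality filters on one list:
    every wanted value must be present."""
    present = set(searchlist)
    return all(value in present for value in wanted)


searchlist = build_searchlist(
    emails=["jon@example.com"],
    ips=["10.0.0.7"],
    usernames=["jon", "jon@example.com"],  # duplicate value is stored once
)
```

  Because the values look different, a lookup by email plus IP is just
matches_all(searchlist, "jon@example.com", "10.0.0.7") here; against the
datastore it would be two equality filters on the one list property.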

  I would suggest thinking carefully about how you plan to search your
data _and_ how you might need to maintain the data.  If you will need
to access the lists for specific types, then I would either add the
"type" property or maybe even use the key_name.  I have used both
methods; to choose between them, consider which makes your most common
use-cases most efficient and still lets you handle edge cases when
needed.
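
  One way the key_name option could look, as a plain-Python sketch.  The
"type:shard" naming scheme, INDEX_TYPES, and both helper functions are
assumptions made up for illustration; nothing in the datastore requires
this format.

```python
# Hypothetical set of index types, taken from the names used in this thread.
INDEX_TYPES = ("emails", "ipaddresses", "usernames")


def index_key_name(index_type, shard=0):
    """Build a key_name such as 'emails:0' that encodes the list type
    and a shard number (for fanning out when one list gets too long)."""
    if index_type not in INDEX_TYPES:
        raise ValueError("unknown index type: %r" % (index_type,))
    return "%s:%d" % (index_type, shard)


def type_from_key_name(key_name):
    """Recover the list type from a key_name built by index_key_name."""
    index_type, _, _shard = key_name.partition(":")
    return index_type
```

  With a scheme like this, the key_name would be passed to the PageIndex
constructor along with parent=page, so the entity for a given type could
be fetched directly by key rather than found with a query on "type".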

  Hope that helps.

Robert



On Thu, Sep 2, 2010 at 03:39, ogterran <[email protected]> wrote:
> Thanks Robert.
> The ancestor query works great. I read the documentation but I guess it
> didn't click.
> I was looking for a children()-like function in the parent.
>
> I don't think I can put it in the same search list.
>
> What if I add a field for type?
>
> class PageIndex(db.Model):
>    type = db.StringProperty(required=True)
>    searchlist= db.StringListProperty(required=True)
>
>  type can be emails, ipaddresses, usernames
>  searchlist can be whatever list.
>
>  Is this better?
>
>  Thanks
>  Jon
>
> On Sep 1, 3:29 pm, Robert Kluin <[email protected]> wrote:
>> Hi Jon,
>>   1) When you build a query you can specify the ancestor.
>>        PageIndex.all(keys_only=True).ancestor(thepage)
>>      http://code.google.com/appengine/docs/python/datastore/queryclass.htm...
>>
>>   2) I would not add multiple list properties to the same model.  I
>> would also probably not make different kinds in this case.
>> Personally, depending on exactly how you will need to query and
>> maintain the data, I would either just put each field in the same
>> "search" list or make a PageIndex entity for each type.
>>
>> Robert
>>
>>
>>
>> On Wed, Sep 1, 2010 at 02:39, ogterran <[email protected]> wrote:
>> > Couple more questions
>> > 1. If I have the parent entity, how do I get all the child entities?
>> > 2. If I have more fan-out lists, so usernames is one, but now I have
>> > emails and ipaddresses
>> >    is it more efficient to create different entity kinds or just add
>> > another list to PageIndex?
>> >   i.e.
>> > class PageIndex(db.Model):
>> >     usernames = db.StringListProperty(required=True)
>> >     emails = db.StringListProperty(required=True)
>> >     ipaddreses= db.StringListProperty(required=True)
>>
>> > or
>>
>> > class PageIndex(db.Model):
>> >     usernames = db.StringListProperty(required=True)
>> > class PageEmailIndex(db.Model):
>> >     emails = db.StringListProperty(required=True)
>> > class PageIpAddressIndex(db.Model):
>> >     ipaddreses= db.StringListProperty(required=True)
>>
>> > Thanks
>> > Jon
>>
>> > On Aug 31, 11:19 pm, ogterran <[email protected]> wrote:
>> >> Thanks guys for all the responses.
>> >> I checked out Brett's presentation and he talks about this exact issue
>> >> on how to optimize using list properties
>>
>> >> So following Brett's presentation, I create 2 entity kinds.
>> >> Page and PageIndex where Page is the parent of PageIndex (specify
>> >> parent in PageIndex constructor)
>>
>> >> class Page(db.Model):
>> >>     pagekey= db.StringProperty(required=True)
>>
>> >> class PageIndex(db.Model):
>> >>     usernames = db.StringListProperty(required=True)
>>
>> >> indexes = db.GqlQuery("SELECT __key__ FROM PageIndex "
>> >>                                    "WHERE usernames = :1", username)
>> >> keys = [k.parent() for k in indexes]
>> >> pages = db.get(keys)
>>
>> >> Since we are querying by key, it is 10x faster, and no unnecessary
>> >> serialization.
>> >> We can fan out by adding multiple PageIndex entities if a list reaches
>> >> the 5000 max.
>>
>> >> Is this about right?
>>
>> >> Thanks
>> >> Jon
>>
>> >> On Aug 31, 9:34 am, Jeff Schwartz <[email protected]> wrote:
>>
>> >> > A list property's size is limited to 5000.
>>
>> >> > If you only want to know if a user has visited a page, you can use the
>> >> > first model & query it returning only keys to avoid list
>> >> > serialization issues. In addition, keys-only queries are very fast.
>>
>> >> > Additionally, if your Page model's key had a string name that was the
>> >> > pagecode then you could even use the key to identify the page. In other
>> >> > words, you'd be materializing the view in the key of the model. This
>> >> > would eliminate the need to serialize the entity entirely when it is
>> >> > used for lookup.
>>
>> >> > Writing out the entity after having updated its list property is another
>> >> > story altogether, as writes are subject to failure due to contention.
>> >> > You can shard your model by a factor of 10 or 20 to reduce, but not
>> >> > eliminate, the possibility of contention.
>>
>> >> > There are a number of Google IO videos on YouTube that you can watch
>> >> > which cover these techniques.
>>
>> >> > Jeff
>>
>> >> > On Tue, Aug 31, 2010 at 5:22 AM, ogterran <[email protected]> wrote:
>> >> > > Hi,
>>
>> >> > > I have a choice to store the data two ways. Which way is more
>> >> > > efficient when querying in BigTable?
>>
>> >> > > First way:
>> >> > > class Page(db.Model):
>> >> > >    pagekey= db.StringProperty(required=True)
>> >> > >    usernames = db.StringListProperty(required=True)
>>
>> >> > > So all the users who visit the page will be added to the usernames
>> >> > > list
>>
>> >> > > Second way:
>>
>> >> > > class Page(db.Model):
>> >> > >    pagekey= db.StringProperty(required=True)
>> >> > >    username = db.StringProperty(required=True)
>> >> > > Each user who visits the page will have their own row
>>
>> >> > > The query will be with both username and pagekey.
>> >> > > There can be a lot of users.
>> >> > > Is there a limit on the StringList size?
>> >> > > What are the advantages of each method of storing data?
>>
>> >> > > Thanks
>> >> > > Jon
>>
>> >> > > --
>> >> > > You received this message because you are subscribed to the Google 
>> >> > > Groups
>> >> > > "Google App Engine" group.
>> >> > > To post to this group, send email to [email protected].
>> >> > > To unsubscribe from this group, send email to
>> >> > > [email protected].
>> >> > > For more options, visit this group at
>> >> > > http://groups.google.com/group/google-appengine?hl=en.
>>
>> >> > --
>> >> > --
>> >> > Jeff
>>
>
>
>

