Oh man, I didn't realize that Eishay's work got turned into a whole Google
code project. Awesome to see what happens when you're curious, pursue
something, then write and tweet about it.

On Thu, Apr 1, 2010 at 12:47 PM, Jeff Schnitzer <[email protected]> wrote:

> It's an interesting thread.  To summarize, you're discussing the best
> way to serialize a 15,000-element dictionary in an entity.
>
> This doesn't seem to be all that closely related to the datastore.
> The datastore api costs are the same no matter how you store it (58ms)
> but the serialization costs vary widely depending on what technique
> you use - pickling, protobuf, expando (you can put 15k properties in
> an expando??).
>
> I guess having a 'select fields' (or 'suppress fields') instruction
> could let you avoid the serialization costs for specific fields, but
> it would require a dramatically more complicated API.  On the other
> hand, you can pretty easily address this as you describe; saving your
> data as Text or Blob and serializing or deserializing it on-demand.
>
> At least in the numbers you posted, there weren't any bandwidth issues
> - if I'm reading it correctly, all of the cost seems to have been
> produced by the serialization process.  It doesn't matter if you fetch
> the Blob data or not, it only matters when you try to convert it into
> a dictionary.  Do the serialization lazily and your problem is solved.
>
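
To illustrate the lazy approach described above, here is a minimal sketch. It assumes the entity's payload comes back as raw bytes, as a Blob property would return it. It uses pickle purely as a stand-in for whichever serializer you choose. The LazyRecord class and its decode_count attribute are hypothetical names made up for this example, not part of any App Engine API.

```python
import pickle


class LazyRecord:
    """Holds serialized bytes and decodes them only on first access."""

    def __init__(self, blob):
        self._blob = blob        # raw bytes, as a Blob property might return them
        self._data = None        # decoded dictionary, filled in on demand
        self.decode_count = 0    # bookkeeping for this demo only

    @property
    def data(self):
        # Pay the deserialization cost once, and only if someone asks.
        if self._data is None:
            self._data = pickle.loads(self._blob)
            self.decode_count += 1
        return self._data


# A 15,000-element dictionary, like the one discussed in the thread.
big = {"key%d" % i: i for i in range(15000)}
record = LazyRecord(pickle.dumps(big, protocol=pickle.HIGHEST_PROTOCOL))
```

Fetching the record costs nothing beyond the bytes themselves. The dictionary is only built the first time record.data is read, and later reads reuse it.
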
> BTW, this is related information that always deserves a link when the
> subject comes up:
>
> http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
>
> Jeff
>
> On Thu, Apr 1, 2010 at 12:03 PM, Eli Jones <[email protected]> wrote:
> > Here are the numbers from a test I did just comparing the process of
> > stuffing the protobuf of an entity into a BlobProperty:
> >
> > http://groups.google.com/group/google-appengine-python/browse_thread/thread/4d6dde610addd8ef/a3037abd34ed03f6?#a3037abd34ed03f6
> >
> > It's 40% to 70% faster and cheaper to do a get, change, put on the
> > protobuf version (see link for model definitions, test code, and results).
> > Using compression with the protobuf made it a little faster and cheaper,
> > but the benefit was not as drastic as the protobuf change.
> >
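
For readers who have not seen the pattern, the get, change, put roundtrip on a single compressed blob can be sketched roughly as follows. This is not the test code from the linked thread. Here pickle stands in for the protobuf encoding, a plain dict named fake_datastore stands in for the datastore, and pack, unpack, and get_change_put are hypothetical helper names.

```python
import pickle
import zlib


def pack(entity_dict):
    """Serialize the whole entity, then compress it into one blob."""
    raw = pickle.dumps(entity_dict, protocol=pickle.HIGHEST_PROTOCOL)
    return zlib.compress(raw)


def unpack(blob):
    """Reverse of pack(): decompress, then deserialize."""
    return pickle.loads(zlib.decompress(blob))


# Stand-in for the datastore: key -> single compressed blob.
fake_datastore = {}


def get_change_put(key, field, value):
    entity = unpack(fake_datastore[key])   # get + decompress
    entity[field] = value                  # change
    fake_datastore[key] = pack(entity)     # compress + put


# Two large properties, stored as one blob instead of two properties.
fake_datastore["e1"] = pack({"body": "x" * 200000, "notes": "y" * 200000})
get_change_put("e1", "notes", "updated")
```

How much compression helps depends heavily on the data. The repetitive strings in this sketch compress far better than most real payloads would.
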
> > On Thu, Apr 1, 2010 at 2:02 PM, Jeff Schnitzer <[email protected]> wrote:
> >>
> >> Do you have any quantitative numbers?  I'd really like to know how
> >> much this saves.
> >>
> >> Jeff
> >>
> >> On Thu, Apr 1, 2010 at 9:25 AM, Eli Jones <[email protected]> wrote:
> >> > Compression will probably show the biggest benefit when there is a
> >> > significant difference between the uncompressed and compressed sizes,
> >> > and the amount of data you're bringing over the wire is large enough
> >> > to cause a noticeable slowdown.
> >> >
> >> > So if all you are doing is pulling over one entity with properties
> >> > that add up to 200 bytes total, compression is just going to slow you
> >> > down.  If you are pulling over an entity that has properties of 200
> >> > kilobytes total, getting the entity, decompressing it, changing it,
> >> > compressing it, and putting it back to the datastore will be faster
> >> > than just getting, changing, and putting a non-compressed version.
> >> >
> >> > Granted, my assumptions in this case are based on tests I did against
> >> > two Models: an uncompressed version with two large un-indexed
> >> > properties, and a compressed Model whose one property was a compressed
> >> > protobuf of the uncompressed Model.  (The protobuf was 390KB before
> >> > compression and 40KB after.)  Even without using compression on the
> >> > protobuf of the big Model, just putting one large property (containing
> >> > a protobuf of the two-property Model) was much faster than directly
> >> > putting the two-property Model (even with indexed = false for the
> >> > props).
> >> >
> >> > After adding compression to the mix, the roundtrip of getting,
> >> > decompressing, changing, compressing, and putting took less time and
> >> > less cpu overall than getting, changing, and putting the un-compressed
> >> > protobuf, never mind comparing it to putting a larger entity with
> >> > multiple defined (but un-indexed) properties.
> >> >
> >> > In the end, compression may be less important than just stuffing an
> >> > entire entity into a BlobProperty as a protobuf.  (I seem to remember
> >> > that the benefit from compressing the large protobuf was nowhere near
> >> > as drastic as the benefit from turning the entire entity into a
> >> > protobuf to be stuffed into one prop.)
> >> >
> >> > On Thu, Apr 1, 2010 at 2:01 AM, Jeff Schnitzer <[email protected]> wrote:
> >> >>
> >> >> On Wed, Mar 31, 2010 at 9:57 PM, Robert Kluin <[email protected]> wrote:
> >> >> >
> >> >> >  Although I have not personally tested with _really_ large entities,
> >> >> > I see very little difference in performance based on the size of the
> >> >> > entity.  We have some models with 20 or 30 string and float fields
> >> >> > that seem to perform similarly to models with 5 or 6 string fields.
> >> >> > There have been a number of threads discussing this in the past.  I
> >> >> > think a post had some benchmarks in December or January.
> >> >>
> >> >> This has been my experience as well.  Additional indexes cost a lot,
> >> >> but additional unindexed properties seem to be almost "free" in the
> >> >> datastore.
> >> >>
> >> >> I would ask of anyone asking for select at a property level:  Have you
> >> >> run any performance tests of your application with big vs small
> >> >> entities?  Are you sure it matters?
> >> >>
> >> >> Jeff
> >> >>
> >> >> --
> >> >> You received this message because you are subscribed to the Google
> >> >> Groups
> >> >> "Google App Engine" group.
> >> >> To post to this group, send email to
> [email protected].
> >> >> To unsubscribe from this group, send email to
> >> >> [email protected]<google-appengine%[email protected]>
> .
> >> >> For more options, visit this group at
> >> >> http://groups.google.com/group/google-appengine?hl=en.
> >> >>


-- 
Ikai Lan
Developer Programs Engineer, Google App Engine
http://googleappengine.blogspot.com | http://twitter.com/app_engine

