Brian (apologies if that is not your name),

How much of the costs are instance hours versus datastore writes? There's
probably something going on here. The largest costs are to update indexes,
not entities. Assuming $6500 is the cost of datastore writes alone, that
breaks down to:

~$0.00046 per put (that is, per entity written)

Pricing is $0.10 per 100k operations, so that means using this equation:

(6500.00 / 14000000) / (0.10 / 100000)

You're doing about 464 write operations per put, which roughly translates
to 6.5 billion writes.
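
If you want to redo that arithmetic with your real numbers, here is a rough Python sketch. It only uses figures from this thread (the $6500 bill, 14 million entities, $0.10 per 100k write operations) and still assumes the whole bill is datastore writes; the function names are just mine for illustration.

```python
# Figures quoted in this thread. Assumption: the full $6500 is datastore
# writes only, with no instance hours mixed in.
TOTAL_COST_USD = 6500.00
ENTITIES_PUT = 14_000_000
PRICE_PER_WRITE_OP_USD = 0.10 / 100_000  # $0.10 per 100k write operations


def write_ops_per_put(total_cost, entities, price_per_op):
    """Average number of billed write operations behind each entity put."""
    cost_per_put = total_cost / entities
    return cost_per_put / price_per_op


def total_write_ops(total_cost, price_per_op):
    """Total billed write operations implied by the bill."""
    return total_cost / price_per_op


ops_per_put = write_ops_per_put(TOTAL_COST_USD, ENTITIES_PUT, PRICE_PER_WRITE_OP_USD)
total_ops = total_write_ops(TOTAL_COST_USD, PRICE_PER_WRITE_OP_USD)

print(f"write ops per put: {ops_per_put:.0f}")
print(f"total write ops:   {total_ops / 1e9:.1f} billion")
```

If part of the bill turns out to be instance hours, lower TOTAL_COST_USD accordingly and the per-put figure drops with it.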

I'm trying to extrapolate what you are doing, and it sounds like you are
doing full text indexing or something similar ... and having to update all
the indexes. When you update a property, it takes a certain amount of
writes. Assuming you are changing String properties, each property you
update takes this many writes:

- 2 indexes deleted (ascending and descending)
- 2 indexes updated (ascending and descending)

So at 4 writes per changed value, if you were only updating list
properties, that means you are updating roughly 100 list property values
per entity.
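
To make that estimate reproducible, here is a small Python sketch that works backwards from the 464 write operations per put. It assumes the 4 writes per changed value listed above, plus 1 write for the entity itself; that extra write is my assumption about how the put is billed, and the function name is made up for illustration.

```python
# Assumptions: 4 index writes per changed property value (2 deleted, 2
# updated, as listed above) and 1 write for the entity itself.
WRITES_PER_CHANGED_VALUE = 4
ENTITY_WRITES = 1


def changed_values_per_put(ops_per_put,
                           writes_per_value=WRITES_PER_CHANGED_VALUE,
                           entity_writes=ENTITY_WRITES):
    """Estimate how many indexed values change in one put."""
    return (ops_per_put - entity_writes) / writes_per_value


estimate = changed_values_per_put(464)
print(f"roughly {estimate:.0f} changed values per put")
```

If other indexes are involved, such as composite indexes, the per-value cost would be different and this estimate would shift.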

Given that this is a regular thing you need to do, perhaps there is an
engineering solution for what you are trying to do that will be more cost
effective. Can you describe why you're running this job? What features does
this support in your product?

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com | twitter.com/ikai



On Thu, Jan 5, 2012 at 10:08 AM, Petey <[email protected]> wrote:

> In this one case we had to change all of the items in the
> listproperty. In our most common case we might have to add and delete
> a couple items to the list property every once in a while. That would
> still cost us well over $1,000 each time.
>
> The main reason for this type of data in our product is to
> compensate for the fact that there isn't full text search yet. I know
> they are beta testing full text, but I'm still worried that it also
> might be too expensive per write.
>
> On Jan 5, 6:54 am, Richard Watson <[email protected]> wrote:
> > A couple thoughts.
> >
> > Maybe the GAE team should borrow the idea of spot prices from Amazon.
> > That's a great way to have lower-priority jobs that can run when there
> > are instances available. We set the price we're willing to pay, if the
> > spot cost drops below that, we get the resources. It creates a market
> > where more urgent jobs get done sooner and Google makes better use of
> > quiet periods.
> >
> > On your issue:
> > Do you need to update every entity when you do this? How many items on
> > the listproperty need to be changed? Could you tell us a bit more of
> > what the data looks like?
> >
> > I'm thinking that 14 million entities x 18 items each is the amount of
> > entries you really have, each distributed across at least 3 servers and
> > then indexed. That seems like a lot of writes if you're re-writing
> > everything.  It's likely a bad idea to rely on an infrastructure change
> > to fix this (recurring) issue, but there is hopefully a way to reduce
> > the amount of writes you have to do.
> >
> > Also, could you maybe run your mapreduce on smaller sets of the data to
> > spread it out over multiple days and avoid adding too many instances? Has
> > anyone done anything like this?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>
>
