There should be no real need to mass-write a new property with a default 
value to all of your entities, since the Datastore supports entities of the 
same kind with and without a given property set (as you have noticed with 
the failed MapReduce job). 

You can assume that an entity without the property has the default value 
"indexed=0", and apply that default in your application at read time. If 
the property exists, read it and use it; otherwise use a hard-coded default 
and set the value in your code at that point (that is, only when the entity 
is being read).
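
As a rough illustration of the read-time default described above, here is a 
minimal Python sketch. FakeEntity, read_indexed and DEFAULT_INDEXED are 
hypothetical names made up for this example, and the entity is a plain 
in-memory stand-in rather than a real Datastore model, so treat it as a 
pattern to adapt to your own client library, not as actual Datastore API.

```python
DEFAULT_INDEXED = 0  # hard-coded default, taken from the question ("indexed=0")


class FakeEntity(object):
    """Hypothetical stand-in for a Datastore entity (not a real API)."""

    def __init__(self, **properties):
        self.properties = dict(properties)
        self.put_count = 0

    def put(self):
        # A real entity would be written back to the Datastore here.
        self.put_count += 1


def read_indexed(entity, write_back=False):
    """Return the entity's 'indexed' value, falling back to the default.

    With write_back=True the default is stored on the entity and saved,
    so only entities that are actually read ever get rewritten.
    """
    if "indexed" in entity.properties:
        return entity.properties["indexed"]
    if write_back:
        entity.properties["indexed"] = DEFAULT_INDEXED
        entity.put()
    return DEFAULT_INDEXED
```

With write_back left at False, old entities are never written at all; 
turning it on spreads the writes over normal traffic instead of one large 
batch job.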

If you do still need to rewrite the entities, updating existing entities is 
documented here 
<https://cloud.google.com/appengine/articles/update_schema#updating-existing-entities>.

Without knowing exactly what happened, it is not possible to determine the 
reason for the 70M reads. However, I would recommend reading this post 
<https://stackoverflow.com/a/15946970>, which might answer your question.


On Friday, August 11, 2017 at 9:02:53 AM UTC-4, Filipe Caldas wrote:
>
> Hi,
>
>   I am currently trying to update a kind in my database and add a field 
> (indexed=0), the table has more than 10M entities.
>
>   I tried to use MapReduce for appengine and launched a fairly simple job 
> where the mapper only sets the property and yields an operation.db.Put(), 
> the only problem is that some of the shards failed, so the job was stopped 
> and automatically restarted.
>
>   Problem is, launching this job on 10M entities cost me about $ 100 and 
> the job was not finished (the retry was going slow so don't think they 
> billed much for that). 
>   
> The extra annoying thing is that there is no other way that I know to 
> update these properties "fast" enough (the mapreduce took over 7 hours to 
> fail on 10M). I know Beam/Dataflow is apparently the way to go, but 
> documentation on doing basic operations like updating Datastore entities is 
> still very poor (not sure if it can even be done).
>
>   So, my question is: is there a fast and *safe* way to update my entities 
> that does not consist of doing 10M fetches and puts in sequence?
>
>   Bonus question: does anyone know why I was billed 70M reads on only 10M 
> entities?
>
> Best regards,
>
