Re: [google-appengine] how reduce memory usage on mapreduce controller callback

Jason Galea Mon, 14 Oct 2013 03:14:39 -0700

Hi Moises,

You can find all the details here:
https://developers.google.com/appengine/docs/python/ndb/cache


"The In-Context Cache

The in-context cache persists only for the duration of a single incoming
HTTP request and is "visible" only to the code that handles that request.
It's fast; this cache lives in memory. When an NDB function writes to the
Datastore, it also writes to the in-context cache. When an NDB function
reads an entity, it checks the in-context cache first. If the entity is
found there, no Datastore interaction takes place."

My take-away is that the in-context cache is handy when different parts of
your code call get() on the same entities, and it would certainly make
things faster, but it comes with the trade-off that every entity you touch
stays in memory until the request completes, even if you don't need it
any more.
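
The behaviour quoted above can be sketched with a toy model. This is plain
Python rather than NDB itself: `FakeDatastore` and `RequestContext` are
hypothetical names made up for the illustration, and the real cache is
certainly more involved. It only shows the two points that matter here: a
repeated get() skips the datastore, and everything read stays referenced
until the context goes away.

```python
# Toy model of a per-request cache; NOT the NDB implementation.


class FakeDatastore(object):
    """Stands in for the Datastore and counts how often it is read."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value


class RequestContext(object):
    """Hypothetical per-request context with an optional in-memory cache."""

    def __init__(self, datastore, use_cache=True):
        self.datastore = datastore
        self.use_cache = use_cache
        self.cache = {}

    def get(self, key):
        if self.use_cache and key in self.cache:
            return self.cache[key]  # cache hit: no datastore read
        value = self.datastore.get(key)
        if self.use_cache:
            self.cache[key] = value  # retained until the context is dropped
        return value

    def put(self, key, value):
        self.datastore.put(key, value)
        if self.use_cache:
            self.cache[key] = value  # writes populate the cache too


store = FakeDatastore({"k%d" % i: {"n": i} for i in range(1000)})
ctx = RequestContext(store)
for i in range(1000):
    ctx.get("k%d" % i)
ctx.get("k0")  # second read of the same key is served from the cache
print("%d datastore reads, %d entities cached" % (store.reads, len(ctx.cache)))
# prints "1000 datastore reads, 1000 entities cached"
```

If I read the NDB page linked above correctly, the real switch is a cache
policy on the context (ndb.get_context().set_cache_policy(False)) or
use_cache=False on individual calls; check that page before relying on
either.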

With queries you're going to load all the entities regardless, so
disabling the in-context cache alone likely won't help much. If you do a
keys_only query, though, and get() each entity in turn, then disabling the
in-context cache should reduce memory usage (assuming that some or all of
the memory used by previous entities can be reused). Once again, though,
you'll likely be sacrificing speed for lower memory usage.

This is mostly based on how I believe the different parts would/should
work; I have no hard evidence.

Jason





On Mon, Oct 14, 2013 at 5:36 PM, Moises Belchin <moisesbelc...@gmail.com> wrote:

> Hi Jason,
>
> Thanks for the detailed answer. I'm very surprised that no one else is
> talking about these issues.
>
> I'm using ndb and my appstats are off. I saw an incredible
> improvement in my app when I turned off the stats. So, I recommend people
> only use stats for testing or debugging.
>
> As you suggested, I'll try projection queries.
>
> Could you explain a bit more about the NDB in-memory cache?
>
> Thanks again.
>
>
>
>
> Regards.
> Moisés Belchín.
>
>
> 2013/10/12 Jason Galea <ja...@lecstor.com>
>
>> Hi Moises,
>>
>> we're currently trying to deal with this issue too. Not in mapreduce,
>> just regular handlers.
>>
>> same here - "Python 2.7 and F2 instances with 256MB"
>>
>> Early on I found that fetching 1000 entities and looping through them to
>> update a property would blow the instance. Reducing this to, say, 100 fixed
>> the issue (we then create another task to do the next 100).
>>
>> "After handling this request" - as far as I understand this isn't so bad
>> unless you're blowing it on every request and starting a new instance is
>> detrimental.
>>
>> "While handling this request" - this concerns us most as the request does
>> not complete and breaks stuff.. and we see far too many of them.
>>
>> I've spent more than a little time trying to work out what is causing the
>> blowouts but, as far as I've been able to tell, pinning down memory usage
>> and what causes it is near impossible (or just very, very hard).
>>
>> Are you using NDB? If so..
>> - you could try disabling the in-memory cache. As I see it, even though
>> you only access one entity at a time, NDB's in-memory cache will store them
>> all until the request is completed.
>> - you could try projection queries if you don't need the complete object
>> (or possibly even if you do). Projection queries get their data from an
>> index and the entities returned cannot be put(), so I assume they are not
>> cached at all. We're trialling some fixes with these at the moment.
>>
>> ** If anyone knows any of this is incorrect, please let me know..
>>
>> Given what we have experienced, I'm actually surprised there is not more
>> discussion of these issues, so maybe we're doing something fundamentally
>> wrong, but I don't believe so.
>>
>> oh, is appstats turned on? I believe the most noticeable improvement
>> we've seen was when we turned it off..
>>
>> regards,
>>
>> Jason
>>
>>
>>
>>
>>
>>
>> On Fri, Oct 11, 2013 at 5:51 PM, Moises Belchin 
>> <moisesbelc...@gmail.com> wrote:
>>
>>> Hi Vinny,
>>>
>>> Thanks for the tips, but actually I'm not loading a file. I'm only using
>>> the mapreduce lib to read all the entities of one of my kinds, work with
>>> them (I only read some properties to compose the CSV line) and then
>>> write to a CSV file on Cloud Storage using the mapreduce FileOutputWriter.
>>>
>>> Any idea why I'm getting these critical memory errors?
>>>
>>> Thanks all again.
>>>
>>>
>>> Regards.
>>> Moisés Belchín.
>>>
>>>
>>> 2013/10/10 Vinny P <vinny...@gmail.com>
>>>
>>>> On Thu, Oct 10, 2013 at 5:47 AM, Moises Belchin <
>>>> moisesbelc...@gmail.com> wrote:
>>>>
>>>>> I'm getting a lot of memory limit critical errors in my app when I use
>>>>> the mapreduce library.
>>>>> I'm working with Python 2.7, App Engine 1.8.5 and F2 instances with
>>>>> 256MB.
>>>>>
>>>>> Has anyone got these critical errors?
>>>>>
>>>>>
>>>>
>>>> Hello Moises,
>>>>
>>>> How large is your input file? Are you loading the entire file into
>>>> memory at once? If so, try moving to a BlobstoreLineInputReader which reads
>>>> in one line at a time - it reduces the memory being used during file
>>>> processing.
>>>>
>>>>
>>>> -----------------
>>>> -Vinny P
>>>> Technology & Media Advisor
>>>> Chicago, IL
>>>>
>>>> App Engine Code Samples: http://www.learntogoogleit.com
>>>>
>>>>  --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Google App Engine" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to google-appengine+unsubscr...@googlegroups.com.
>>>> To post to this group, send email to google-appengine@googlegroups.com.
>>>> Visit this group at http://groups.google.com/group/google-appengine.
>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>
>>>
>>
>>
>>
>> --
>> Jason Galea
>> lecstor.com
>>
>



-- 
Jason Galea
lecstor.com

