OK! I did not solve this issue on GAE. However, I moved the code from this module to a GCE instance, and it now runs successfully (3 for 3 so far). I did the following:
1. Copied all code (including dependencies) to a Google Cloud Storage bucket.

2. Created a bash startup script, run by the GCE instance at boot, that downloads all required libraries and the code from the bucket; it also copies the syslog to another bucket upon completion. I put this script in its own bucket on GCS. Side note: the adspygoogle API code does a lot of logging at the info level if you make a lot of calls, so I forked a copy and lowered the log level to debug.

Example library install (I used the default debian-7-wheezy-v20130926 image):

    apt-get -y install python-mysqldb

Example copy from the GCS bucket via metadata. The cs-bucket field is just the string name of your bucket. Since gsutil copied the files into a folder named after the bucket, I then had to copy the files into my working directory in order to access them while running my main code. I very much like working with the preinstalled gsutil, and I'd love to see a similar utility for Cloud SQL that does more than administration (e.g., can query the databases).

    DEST_DIR=$(pwd)
    CS_BUCKET=$(curl http://metadata/computeMetadata/v1beta1/instance/attributes/cs-bucket)
    gsutil cp -R gs://$CS_BUCKET $DEST_DIR
    cp -r $CS_BUCKET/. $DEST_DIR

3. Reserved an external IP on Compute Engine. I need the external IP to support Cloud SQL access. I authorized this IP explicitly per the Cloud SQL instructions here: https://developers.google.com/cloud-sql/docs/external.

4. Modified the module code on GAE to just start a GCE instance. Useful information found here: https://github.com/GoogleCloudPlatform/compute-getting-started-python. I added metadata for the GCS bucket locations (startup script and code), added the external IP (natIP) to the instance, and toned down some of the logging (the startup script takes a while to run, so I increased the buffer time between attempts to shut down the instance).
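For what it's worth, step 4 can be sketched roughly like this. This is only a hedged illustration of building the instance resource body for a Compute Engine `instances().insert()` call; the function name, zone, machine type, and bucket/file names here are my own placeholders, not from the getting-started repo, and the boot disk configuration is omitted:

```python
# Sketch of step 4: construct the body for compute.instances().insert()
# from the GAE module. Zone, machine type, and names are illustrative.

def build_instance_body(project, name, script_bucket, code_bucket, external_ip):
    """Return an instance resource dict carrying the startup script URL,
    the code bucket name as metadata, and a reserved external IP (natIP)."""
    project_url = 'https://www.googleapis.com/compute/v1/projects/%s' % project
    return {
        'name': name,
        'machineType': ('%s/zones/us-central1-a/machineTypes/n1-standard-1'
                        % project_url),
        # Metadata the startup script reads back via the metadata server
        # (see the curl call to .../attributes/cs-bucket above).
        'metadata': {'items': [
            {'key': 'startup-script-url',
             'value': 'gs://%s/startup.sh' % script_bucket},
            {'key': 'cs-bucket', 'value': code_bucket},
        ]},
        # Reserved external IP so the address authorized in Cloud SQL
        # stays stable across instance restarts.
        'networkInterfaces': [{
            'accessConfigs': [{'type': 'ONE_TO_ONE_NAT',
                               'name': 'External NAT',
                               'natIP': external_ip}],
            'network': '%s/global/networks/default' % project_url,
        }],
    }
```

The GAE module then passes a body like this to the Compute API client's insert call (along with project and zone) using authorized credentials, and later deletes the instance once the syslog has been copied out.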
If you're using the gce file from the repository above, adding an external IP looks something like this, where external_ip is the string value of the address:

    instance['networkInterfaces'] = [{
        'accessConfigs': [{'type': 'ONE_TO_ONE_NAT',
                           'name': 'External NAT',
                           'natIP': external_ip}],
        'network': '%s/global/networks/%s' % (self.project_url, network)}]

Wish I knew more about what caused this to fail on GAE. It may have been the volume of calls to either the database or the AdWords API, but my guess would be some variable(s) hogging more and more memory as the script ran.

Thanks again for the help.

Will

On Tuesday, November 12, 2013 2:14:27 PM UTC-8, [email protected] wrote:
> Sorry for the temporary radio silence. I was trying to get an AppStats log for my local machine to compare. Absent that, qualitatively from running this many times I'd say the times are pretty close between local and GAE hosted (even with pretty substantial debug logging locally). The GAE stats came in at 37ms on average, and that's got to be about right.
>
> I'm going to attempt to:
>
> 1. Break this script down into smaller components and run them sequentially. The hope here is that the garbage collector helps with any monster memory variables.
> 2. Lower my urlfetch default deadline and explicitly handle url deadline exceeded errors within my AdWords batches. The theory here is that GAE is killing jobs that consistently exceed some arbitrary urlfetch deadline, independent of what is being set by an individual application. Not necessarily a bad thing.
> 3. Move these pieces from a module to a Google Compute Engine instance.
>
> While it's not ideal working without an explicitly identified issue, I've got to do what I can to get this consistently running. I'm still 50% at best.
>
> Thanks again for all your help.
> Will
>
> On Friday, November 8, 2013 8:34:03 PM UTC-8, Vinny P wrote:
>> On Fri, Nov 8, 2013 at 5:06 PM, <[email protected]> wrote:
>>> OK - here's what I see. The script ran two out of two times successfully with AppStats (of course). There are many, many rdbc and AdWords API calls. However, the longest real time spent was on a single AdWords API call, for 60 seconds.
>>
>> If you repeat this AdWords API call from your local machine, does it take the same amount of time or less?
>>
>> On Fri, Nov 8, 2013 at 5:06 PM, <[email protected]> wrote:
>>> 1. The first time I successfully ran the script with appstats I got an "Invalid or stale record" error when trying to view the timeline. I made no changes, waited about an hour, then reran the script and it worked...
>>
>> Yes, AppStats occasionally needs some time to initialize. Feel free to leave the AppStats recorder on - it doesn't take up much memory and it might catch some interesting info sooner or later.
>>
>> -----------------
>> -Vinny P
>> Technology & Media Advisor
>> Chicago, IL
>>
>> App Engine Code Samples: http://www.learntogoogleit.com

-- 
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
