OK! I did not solve this issue for GAE. However, I moved the code from this 
module to a GCE instance and it now runs successfully (3 for 3 so far). I 
did the following:

1. Copied all code (including dependencies) to a Google Cloud Storage bucket.

2. Created a bash script to be run as a startup script by the GCE instance. 
It downloads all required libraries and the code from the bucket, and copies 
the syslog to another bucket upon completion. I put this script in its own 
GCS bucket. Side note: the adspygoogle API code does a lot of logging at the 
info level if you make many calls, so I forked a copy and changed the log 
level to debug.

Example library download (using the default debian-7-wheezy-v20130926 image): 

apt-get -y install python-mysqldb

Example copy from a GCS bucket via metadata. The cs-bucket field is just the 
name of your bucket. Since gsutil copies the files into a folder named after 
the bucket, I then had to copy them into my working directory so my main 
code could access them. I very much like working with the preinstalled 
gsutil; I'd love to see a similar utility for Cloud SQL that does more than 
administration (e.g., one that can query the databases).

DEST_DIR=$(pwd)

CS_BUCKET=$(curl http://metadata/computeMetadata/v1beta1/instance/attributes/cs-bucket)

gsutil cp -R gs://$CS_BUCKET $DEST_DIR

cp -r $CS_BUCKET/. $DEST_DIR
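Putting steps 1 and 2 together, the whole startup script looks roughly like 
this. It's only a sketch: the entry-point filename (main.py) and the logs 
bucket name are illustrative placeholders, not my actual values.

```shell
#!/bin/bash
# Sketch of the full GCE startup script described above.
# main.py and gs://my-logs-bucket are illustrative placeholders.

# Install dependencies on the default debian-7-wheezy-v20130926 image.
apt-get -y install python-mysqldb

# Read the code bucket name from instance metadata, then pull the code.
DEST_DIR=$(pwd)
CS_BUCKET=$(curl http://metadata/computeMetadata/v1beta1/instance/attributes/cs-bucket)
gsutil cp -R gs://$CS_BUCKET $DEST_DIR
cp -r $CS_BUCKET/. $DEST_DIR

# Run the job, then copy the syslog out to a bucket for inspection.
python main.py
gsutil cp /var/log/syslog gs://my-logs-bucket/syslog-$(date +%s)
```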

3. Reserved a static external IP on Compute Engine. I need the external IP 
to support Cloud SQL access, and I authorized this IP explicitly per the 
Cloud SQL instructions here: https://developers.google.com/cloud-sql/docs/external. 

4. Modified the module code on GAE to just start a GCE instance. Useful 
information is here: 
https://github.com/GoogleCloudPlatform/compute-getting-started-python. I 
added metadata for the GCS bucket locations (startup script and code), added 
the external IP (natIP) to the instance, and toned down some of the logging 
(the startup script takes a while to run, so I increased the buffer time 
between attempts to shut down the instance).
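Attaching the bucket locations as instance metadata can be sketched roughly 
as follows. The metadata key names, bucket paths, and the helper function 
are my own illustrative choices, not the exact code I ran; the Compute API 
expects metadata as a list of key/value items.

```python
# Sketch of passing the GCS bucket locations as instance metadata when
# the GAE module starts the worker. Keys and bucket names are placeholders.
def build_metadata(startup_script_url, cs_bucket):
    """Compute API metadata: a dict holding a list of {'key', 'value'} items."""
    return {'items': [
        # URL of the startup script the instance runs on boot.
        {'key': 'startup-script-url', 'value': startup_script_url},
        # Code bucket name; the startup script reads this back via the
        # metadata server (the curl example in step 2).
        {'key': 'cs-bucket', 'value': cs_bucket},
    ]}

instance = {'name': 'adwords-worker'}
instance['metadata'] = build_metadata('gs://my-scripts/startup.sh',
                                      'my-code-bucket')
```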

If using the gce file from the information above, then adding an external 
IP looks something like this, where external_ip is the string of the 
address:

instance['networkInterfaces'] = [{
    'accessConfigs': [{'type': 'ONE_TO_ONE_NAT',
                       'name': 'External NAT',
                       'natIP': external_ip}],
    'network': '%s/global/networks/%s' % (self.project_url, network)}]

 
Wish I knew more about what caused this to fail on GAE. It may have been 
the volume of calls to either the database or the AdWords API, but my guess 
would be some variable(s) hogging up more and more memory as the script ran.

Thanks again for the help.

Will

On Tuesday, November 12, 2013 2:14:27 PM UTC-8, [email protected] wrote:
>
> Sorry for the temporary radio silence. I was trying to get an AppStats log 
> for my local machine to compare. Absent that, qualitatively from running 
> this many times I'd say the times are pretty close between local and GAE 
> hosted (even with pretty substantial debug logging locally). The GAE stats 
> came in at 37ms on average, and that's got to be about right.
>
> I'm going to attempt to:
>
>    1. Break this script down into smaller components and run them 
>    sequentially. The hope here is that the garbage collector helps with any 
>    monster memory variables.
>    2. Lower my urlfetch default deadline and explicitly handle url 
>    deadline exceeded errors within my adwords batches. The theory here is 
> that 
>    GAE is killing jobs that consistently exceed some arbitrary urlfetch 
>    deadline independent of what is being set by an individual application. 
> Not 
>    necessarily a bad thing.
>    3. Move these pieces from a module to a Google Compute Engine instance.
>
> While it's not ideal working without an explicitly identified issue, I've 
> got to do what I can to get this consistently running. I'm still 50% at 
> best.
>
> Thanks again for all your help.
>
> Will
>
> On Friday, November 8, 2013 8:34:03 PM UTC-8, Vinny P wrote:
>>
>> On Fri, Nov 8, 2013 at 5:06 PM, <[email protected]> wrote:
>>
>>> OK - here's what I see. The script ran two out of two times successfully 
>>> with AppStats (of course). There are many, many rdbc and AdWords API calls. 
>>> However, the longest real time spent was on a single AdWords API call for 
>>> 60 seconds.
>>>
>>
>>
>> If you repeat this AdWords API call from your local machine, does it take 
>> the same amount of time or less?
>>
>>
>> On Fri, Nov 8, 2013 at 5:06 PM, <[email protected]> wrote:
>>
>>>
>>>    1. The first time I successfully ran the script with appstats I got 
>>>    an "Invalid or stale record" error when trying to view the timeline. I 
>>> made 
>>>    no changes, waited about an hour, then reran the script and it worked... 
>>>
>>>
>>
>> Yes, AppStats occasionally needs some time to initialize. Feel free to 
>> leave the AppStats recorder on - it doesn't take up much memory and it 
>> might catch some interesting info sooner or later.
>>   
>>  
>> -----------------
>> -Vinny P
>> Technology & Media Advisor
>> Chicago, IL
>>
>> App Engine Code Samples: http://www.learntogoogleit.com
>>   
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.