[google-appengine] Re: dev_appserver hangs while running tests

Shaunak Godbole Thu, 11 May 2017 20:51:36 -0700

That is very weird indeed.  

To be fair, the tests suddenly stopped working one day and since then we've 
been seeing the tests hang every time.


I'm going to try and upgrade my SDK version and see if that helps.  I'm 
also going to try it on a different machine to see if it works there as 
well.

Thanks!
Shaunak

On Thursday, May 11, 2017 at 4:05:36 PM UTC-7, Nick (Cloud Platform 
Support) wrote:
>
> Hey Shaunak,
>
> Interesting indeed! My SDK version is: 
>
> $ goapp version
> go version 1.6.3 (appengine-1.9.53) linux/amd64
>
> Downgrading to 1.9.48 didn't cause the issue to appear, either. I'm 
> running an Intel(R) Xeon(R) CPU @ 2.2 GHz, and I only have 7.3G memory!
>
> I'm running the exact code you'd specified above, as well... Is it 
> possible that your CPU and memory are being quite used-up by other 
> processes?
>
> Cheers,
>
> Nick
> Cloud Platform Community Support
>
> On Thursday, May 11, 2017 at 6:45:20 PM UTC-4, Shaunak Godbole wrote:
>>
>> That's interesting.
>>
>> goapp version prints the following:
>>
>> go version go1.6.3 (appengine-1.9.48) darwin/amd64
>>
>>
>>
>> I'm using MacBookPro with the following configuration:
>> Processor: 2.2 GHz Intel Core i7 
>> Memory 16G
>> Graphics Intel Iris Pro 1536MB
>>
>> Thanks!
>> Shaunak
>>
>>
>> On Thursday, May 11, 2017 at 3:19:06 PM UTC-7, Nick (Cloud Platform 
>> Support) wrote:
>>>
>>> Hey Shaunak,
>>>
>>> I've not been able to observe the crash as you saw. For me, it completes 
>>> in a timely manner. I'm curious to know what version of the SDK you're 
>>> using now and what are the specs of the machine you're running this on?
>>>
>>> Cheers,
>>>
>>> Nick
>>> Cloud Platform Community Support
>>>
>>> On Thursday, May 11, 2017 at 10:32:50 AM UTC-4, Shaunak Godbole wrote:
>>>>
>>>> Thanks Nick, 
>>>>
>>>> I was able to isolate the problem significantly.  It seems like there 
>>>> is a limit on the number of operations that can be read in a given amount 
>>>> of time.  If you run the following program:
>>>>
>>>> package main
>>>>
>>>> import (
>>>> "fmt"
>>>> "testing"
>>>>
>>>> "google.golang.org/appengine"
>>>> "google.golang.org/appengine/aetest"
>>>> "google.golang.org/appengine/datastore"
>>>> )
>>>>
>>>> type Entity struct {
>>>>   Id int64
>>>> }
>>>>
>>>> func TestLoad(t *testing.T) {
>>>>         opts := aetest.Options{StronglyConsistentDatastore: true}
>>>>         inst, _ := aetest.NewInstance(opts)
>>>> defer inst.Close()
>>>>
>>>>
>>>> for i := 0; i < 10000; i++ {
>>>> req, _ := inst.NewRequest("GET", "/", nil)
>>>> ctx := appengine.NewContext(req)
>>>>
>>>> k := datastore.NewKey(ctx, "Entity", "", 12345, nil)
>>>> var entity Entity
>>>> datastore.Get(ctx, k, &entity)
>>>> fmt.Println("Iteration Count ", i+1)
>>>> ctx.Done()
>>>> }
>>>> }
>>>>
>>>> This program gets stuck at Iteration count 240.  We suspect that 
>>>> aetest, or appengine go library releases its hold on some resources only 
>>>> after ctx.deadline expires.  This code seems to pass through if we change 
>>>> the context deadline to be 50 ms.  Increasing the deadline still makes the 
>>>> program pass through but after 240th iteration, we see each subsequent 
>>>> step 
>>>> taking roughly the same time as deadline.
>>>>
>>>> We also added code to see the number of goroutines that get added as 
>>>> part of the each of the steps and can see that 2 goroutines are added at 
>>>> every step and the number keeps on increasing until step number 240.  Our 
>>>> guess is that these goroutines are holding up the resources.  Is there any 
>>>> way to determine which goroutines are these?
>>>>
>>>> Thanks,
>>>> Shaunak
>>>>
>>>>
>>>> On Wednesday, May 10, 2017 at 1:13:04 PM UTC-7, Nick (Cloud Platform 
>>>> Support) wrote:
>>>>>
>>>>> Hey Shaunak,
>>>>>
>>>>> You might want to run a set of tests across the parameter space of all 
>>>>> the relevant variables:
>>>>>
>>>>> Vary the total number of tests:
>>>>>
>>>>> N = number of tests total
>>>>>
>>>>>
>>>>> Vary the proportion of N taken up by each type of test:
>>>>>
>>>>> N_a = number of tests of type a
>>>>> N_b = number of tests of type b
>>>>>
>>>>> . . . 
>>>>>
>>>>> N_e = number of tests of type e
>>>>>
>>>>> . . . 
>>>>>
>>>>>
>>>>> Vary the parameters of the tests themselves:
>>>>>
>>>>> a_x = critical parameter x involved in running a test of type a
>>>>> a_y = critical parameter y involved in running a test of type a
>>>>> . . .
>>>>> e_x = critical parameter x involved in running a test of type e
>>>>>
>>>>> . . . 
>>>>>
>>>>>
>>>>> You would script a series of runs of test suites defined by their 
>>>>> position in "test suite parameter space" (N, N_a, N_b, . . . , N_e, . . . 
>>>>> , 
>>>>> a_x, a_y, . . . ,  e_x, . . .), programmatically logging whether the test 
>>>>> suite completed or crashed, how long it took before it crashed if so, and 
>>>>> you could possibly observe correlations between crashing and certain 
>>>>> kinds 
>>>>> of tests and/or rule out which variables seem to have no impact on 
>>>>> whether 
>>>>> it crashes or not.
>>>>>
>>>>> * * *
>>>>>
>>>>> Beyond that, though, it seems you'd need to either go source-code 
>>>>> diving or get the assistance of someone who actually works on the 
>>>>> codebase 
>>>>> of the go SDK and aetest. You might want to file an issue in the github 
>>>>> issue tracker for aetest <https://github.com/golang/appengine> as 
>>>>> soon as you can get more info from a more rigorous test of test suite 
>>>>> parameters as described above.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Nick
>>>>> Cloud Platform Community Support
>>>>>
>>>>>
>>>>> On Wednesday, May 10, 2017 at 1:07:39 PM UTC-4, Shaunak Godbole wrote:
>>>>>>
>>>>>> Hello, we are using aetest to write some integration tests as part of 
>>>>>> our project.  The integration tests spin up an instance of a 
>>>>>> dev_appserver.py in a *BeforeSuite* function.  We use APIs to 
>>>>>> guarantee that the suite executes before we start the test:
>>>>>>
>>>>>> func BeforeSuite() {
>>>>>>   params := aetest.InstanceParams{StrongConsistency: true}
>>>>>>   inst := aetest.NewInstance(params)
>>>>>>   req := aetest.NewRequest("Get", "/", nil)
>>>>>>   ctx := appengine.NewContext(req)
>>>>>> }
>>>>>>
>>>>>> We have a similar *AfterSuite* that tears down the instance. 
>>>>>>
>>>>>> func AfterSuite() {
>>>>>>   inst.Close()
>>>>>> }
>>>>>>
>>>>>> Since we want the tests to be completely hermetic, we delete all the 
>>>>>> entities that we had created during the tests, usually in an 
>>>>>> *AfterEach* method.
>>>>>>
>>>>>> func AfterEach() {
>>>>>>   deleteAllEntitiesFromDatastore("ENTITY_1")
>>>>>>   deleteAllEntitiesFromDatastore("ENTITY_2")
>>>>>>   deleteAllEntitiesFromDatastore("ENTITY_3")
>>>>>>   ...
>>>>>> }
>>>>>>
>>>>>> And, the we have a lot of test cases that simply run against the 
>>>>>> created instance and use the context acquired through *BeforeSuite*. 
>>>>>>  We have a good mixture of Gets, Puts, Entity Queries, Filter Queries 
>>>>>> etc. 
>>>>>>  Since a few days, every-time we run our tests, the tests hang at some 
>>>>>> point during the execution.  This point changes as we add and remove 
>>>>>> some 
>>>>>> of the tests, but the tests always hang if the number of suites are 
>>>>>> above a 
>>>>>> certain threshold.  We usually see errors like:
>>>>>>
>>>>>>     <string>: Call error 11: Deadline exceeded (timeout)
>>>>>>
>>>>>>
>>>>>> or like:
>>>>>>
>>>>>> Expected success, but got an error:
>>>>>>     <*url.Error | 0xc8210570b0>: {
>>>>>>         Op: "Post",
>>>>>>         URL: "http://localhost:59072";,
>>>>>>         Err: {s: "EOF"},
>>>>>>     }
>>>>>>     Post http://localhost:59072: EOF
>>>>>>
>>>>>>
>>>>>> While the tests are hung, if we try to access the datastore through 
>>>>>> the ip produced in the tests, the request just hangs similar to how it 
>>>>>> hangs during the tests.  This makes us suspicious that there is some 
>>>>>> kind 
>>>>>> of a deadlock that is getting introduced during our execution, but we 
>>>>>> are 
>>>>>> not able to figure out how to debug it.  
>>>>>>
>>>>>> Also, the test suites have a timeout after which the tests simply 
>>>>>> crash.  At this point, the AfterSuite cannot execute so the test-server 
>>>>>> remains active in the background, and when we try to access datastore 
>>>>>> through the admin, we can see all the data that was added as part of the 
>>>>>> test.
>>>>>>
>>>>>> For completeness sake, we also use gomocks to abstract out some 
>>>>>> external services as part of the tests.
>>>>>>
>>>>>> Any help on debugging this would be very much appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>> Shaunak
>>>>>>
>>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/google-appengine.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/google-appengine/5ead3615-f543-487e-bfe6-e2974f1da5b8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

[google-appengine] Re: dev_appserver hangs while running tests

Reply via email to