Hi Ronoaldo, I'd probably go with Barry's or Brandon's suggestion: lower the max concurrency, or disable the process, while you're debugging.
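A rough way to see why the instance count can climb while req/sec stays flat is Little's law: requests in flight is roughly the arrival rate times the time each request takes. The sketch below is only a back-of-the-envelope estimate under stated assumptions: that the latency chart's y-axis (30000 to 150000) is in milliseconds, that the sync tasks dominate the traffic, and that with multithreading disabled each instance serves one request at a time. The class and method names (`QueueSizing`, `tasksInFlight`, `cappedInFlight`, `effectiveRate`, `instancesNeeded`) are made up for illustration and are not part of any App Engine API.

```java
// Back-of-the-envelope queue sizing (illustrative only, not App Engine API code).
final class QueueSizing {

    /** Little's law: tasks in flight = dispatch rate (tasks/s) x time per task (s). */
    static long tasksInFlight(double ratePerSecond, double latencySeconds) {
        return Math.round(ratePerSecond * latencySeconds);
    }

    /** In-flight tasks cannot exceed the queue's max concurrency setting. */
    static long cappedInFlight(double ratePerSecond, double latencySeconds, long maxConcurrent) {
        return Math.min(tasksInFlight(ratePerSecond, latencySeconds), maxConcurrent);
    }

    /** Once the cap is hit, throughput drops to maxConcurrent / latency. */
    static double effectiveRate(double ratePerSecond, double latencySeconds, long maxConcurrent) {
        return Math.min(ratePerSecond, maxConcurrent / latencySeconds);
    }

    /** Instances needed if each one serves requestsPerInstance requests at once. */
    static long instancesNeeded(long inFlight, int requestsPerInstance) {
        return (inFlight + requestsPerInstance - 1) / requestsPerInstance; // ceiling division
    }

    public static void main(String[] args) {
        double rate = 20.0;                      // tasks/s, from the original post
        double[] latencies = {5, 30, 100, 150};  // seconds per task; assumed sample values
        long[] caps = {3000, 50};                // 3000 is from the post; 50 is a hypothetical lower cap

        for (long cap : caps) {
            for (double latency : latencies) {
                long inFlight = cappedInFlight(rate, latency, cap);
                System.out.printf(
                    "cap=%d latency=%.0fs -> in flight=%d, instances=%d, rate=%.2f/s%n",
                    cap, latency, inFlight, instancesNeeded(inFlight, 1),
                    effectiveRate(rate, latency, cap));
            }
        }
    }
}
```

At 20 tasks/s and roughly 100 s per task this gives about 2000 requests in flight, which is in the same range as the 2000 instances reported, and a cap of 3000 is too high to stop it. A much lower max concurrency bounds the instance count, at the cost of throughput whenever latency is high.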
Do you know what is causing the latency to slowly increase? If it is due to the load on the remote servers, have you investigated whether "batching" would help reduce that load? Or is it on the App Engine side? Perhaps you're using offsets to get to the next entity to process?

Robert

2012/2/12 Ronoaldo José de Lana Pereira <[email protected]>
> Some inputs:
>
> Latency:
>
> [Chart: latency of Dynamic Requests over the last 6 hours; y-axis ticks from 30000 to 150000]
>
> Req/Sec:
>
> [Chart: requests per second over the last 6 hours (Dynamic Requests, Static Requests, Cached Requests); y-axis ticks from 6.00 to 30.00]
>
> Instances:
>
> [Chart: instances over the last 6 hours (Total, Active, Billed); y-axis ticks from 500 to 2500]
>
> With normal traffic, our app usually needs about 40 - 60 instances to run.
> Our setup is the Java runtime with multithreading disabled (we tried
> enabling it, but the error rate was too high due to
> DeadlineExceededExceptions and HardDeadlineExceededExceptions).
> Currently we are on MS, but we are in the process of migrating to HRD.
>
> Since Friday, we have been running an operation to sync 500k contacts with
> an external app. Each contact's sync requires about 10 API calls to the
> remote server (urlfetch calls). Syncing one contact is slow overall and,
> because of limitations of the remote service, we need to sync each contact
> individually. We started running this sync in a queue with a rate of 1/s.
> This proved to work, but it was extremely slow.
>
> Today I decided to go faster and configured the queue to run at 20/s with
> max_concurrent of 3000, since this is a Sunday, with less traffic than
> usual on both our app and the remote service. At that point, there were
> around 350k contacts to sync. A few hours later, our app was running with
> 2000 instances, the app was responding slowly, and there were still 150k
> contacts remaining to sync.
>
> I'm assuming that I did something very, very wrong, but I don't know
> where to start. What I found strange was that the instance count kept
> growing, seemingly without limit, while the req/sec was stable. So, my
> question: what did I do so wrong that it cost me around $320 in a few
> hours!?
>
> Any tips on how to solve this problem more efficiently? I followed the
> suggestion to do small things in tasks, so 1 contact sync (~10
> urlfetch calls + ~5 datastore read ops) = 1 task.
>
> Thanks in advance for any suggestions, and sorry for the long post.
>
> Best Regards,
>
> -Ronoaldo Pereira
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/google-appengine/-/Kmmd_14YDmQJ.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
