Hi Brandon,

Well, I started using GAE simply because 2 years ago I was a tech team of 1 and couldn't afford to hire full-time sysadmins. I'm migrating some of my stuff out now that I have more guys to help me. GAE is a great platform that runs on its own and doesn't require much administration (I launched games and apps on it that just ran for months with no major issues). So it's great for starting up. But as soon as you enter the big data domain, you need more control over the way you process and move your data around (the big companies all have their own datacenters because they need full control over the infrastructure), and so a PaaS may not be suitable anymore.
It's hard to plan for your business growing 10x within a few months, with the tech infrastructure suddenly having to go from 50 req/s to 5,000 req/s. BTW, GAE can't handle such load well (a minimum latency of 500 ms on Java seriously sucks, not to mention write contention on the datastore). It is easy to plan when everything can be defined in advance (with budgets and so on), but you don't always have that option.

But thanks for sharing your input anyway, always appreciated ;)

On Dec 29, 4:31 pm, "Brandon Wirtz" <[email protected]> wrote:
> Development is not about not making mistakes, it is about doing structured
> performance testing and cost analysis.
>
> My team writes 500 lines of code for every 50 that make it in to the final
> product.
>
> We know things about the efficiencies of Do While vs. ForEach that quite
> possibly Google doesn't even know. We are that anal about testing. We test
> query speed done different ways and compare cost and performance based on
> the anticipated ratios of use.
>
> We just never let "mistakes" grow to the point we can't control them.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Raymond
> Sent: Thursday, December 29, 2011 12:13 AM
> To: Google App Engine
> Subject: [google-appengine] Re: Cautionary Tale: Abusive price for data
> migration and deletion
>
> Dear Yohan,
>
> On my side I thank you for sharing your experience, I am beginning with GAE
> and know that whatever the time I will put on this project I will be making
> beginner mistakes and this kind of info is precious.
> I have now a limited experience with GAE and have to compare it with what I
> know and in some sectors GAE looks very bad, for example I can't imagine
> Oracle, DB2, Informix, etc, ...MsSQL, etc having any commercial success if
> they would not have implemented rock solid solutions to import and export
> data, backup, build and drop tables and databases and of course calculate
> precisely the data space required to build a data structure, in some cases
> down to the byte.
> Although I understand the very different nature of GAE compared to these
> traditional DB engines, I think that any professional developer, IT manager,
> project manager, or person responsible for budget would feel very
> uncomfortable building a system without a firm grip on its costs or a
> reasonable solution to modify an initial implementation or migrate away from
> it. Also the fact that part of the GAE tools are simply not reliable enough
> to be able to plan effort and time required to do something is another big
> minus for this solution.
>
> Although DBs are not my main competence, my very first paid job
> 20+ years ago was to migrate a critical database to a new structure on
> a new machine (HP 9000 Unix), using a long forgotten database engine, the
> first attempt using SQL took 1 week to migrate, the second using low level C
> calls took months to develop and migrated in the required
> 3.5 hours, but the important thing to note is that it never crossed my mind
> to question the reliability of the machine, the database or the C calls I
> was making to the DB, it just worked, the server could be locked for minutes
> swapping to disk because of lack of memory or overload, but it never failed
> once and repeated the exercise time and time again, reliably and in a
> predictable timeframe.
>
> All this said there are advantages to GAE that are worth fighting with its
> limitations, I have not yet found anything else that is so immediately and
> massively scalable and at the same time does not require me to manage the
> software and hardware, this is invaluable, and although I know that I could
> have an easier job moving to MySQL, I just don't want to manage an OS and a
> DB engine, I don't have the time, I have done it and don't think that's
> where I am going to earn my bacon.
>
> I will always envy some of the people answering your message for the depth
> of knowledge they have of this platform and the fact that they always have
> the right solution and right answer to everything, it must be great to never
> make mistakes.
>
> -R
>
> On Dec 29, 6:25 am, jon <[email protected]> wrote:
> > Yohan I agree that there should be an easy and cheap way to get your
> > data out. I think it's a little unfair that leaving GAE is made that
> > hard.
>
> > How much did you spend on your custom data download tool? Would you
> > consider open sourcing it for other developers who are caught in the
> > same position? I'd hate spending weeks building a custom tool just to
> > get my data out.
>
> > Thanks for sharing your experience.
>
> > On Dec 29, 12:26 am, Yohan <[email protected]> wrote:
>
> > > Hi Brandon,
>
> > > Although I agree with you that the original dataset wasn't fully
> > > optimized (that was over 2 years ago), I believe that I have a good
> > > understanding of datastore vs SQL, caching etc. I'm not building
> > > public facing websites, I'm dealing with private APIs and I am already
> > > stretching memcache and custom built Java cache to the limits.
>
> > > I am also not talking about the reasons why I'm migrating out of GAE.
> > > The points I highlighted were:
>
> > > - no easy way to get your data out
> > > - no cheap way to get your big data out
> > > - bulk export in Python doesn't handle binary/blob data
> > > - remote API is unstable
> > > - running database queries using cursors for long periods of time is
> > > unreliable (many times the cursor got reset for some reason or the
> > > query would return a 0000000 cursor thus screwing 1 week of data
> > > processing)
> > > - it cost me an arm to delete my data
>
> > > To answer other questions:
> > > - of course I thought about migrating the remaining data to a new
> > > app then alias from the old app to the new one. But it means
> > > interrupting the service (disable datastore writes) and I can't
> > > afford that. Plus the remaining data is still quite big.
> > > - the multi indexes: every time I changed the data structure I would
> > > reprocess everything to conform it to the new schema. I'm not using
> > > any framework like Objectify or JDO, I'm working with the raw API
> > > directly (which is way more elegant)
> > > - I'm not criticizing the platform, I am criticizing the lack of tools
> > > to export and the prohibitive cost of manipulating large data sets.
> > > I actually love GAE, it is just not for this kind of dataset, that's all.
>
> > > @Brandon: If you have a way to delete 2 billion entities (whatever
> > > their size) on the cheap please let me know.
>
> > > On Dec 28, 8:48 pm, Leandro Rezende <[email protected]> wrote:
>
> > > > u pay to write, pay to keep it stored... delete should be free.
>
> > > > 2011/12/28 Brandon Wirtz <[email protected]>
>
> > > > > Yes,
>
> > > > > While the primary app I talk about is edge Cache, that’s because
> > > > > that’s the thing that people can most benefit from that people
> > > > > don’t seem to be
> > > > > using.
>
> > > > > As part of my SEO tools we have what is now a 60 TB database of
> > > > > Backlinks and Crawler data about websites in the top 200k Alexa
> > > > > Sites.
>
> > > > > Why should Deleting be Cheaper? The Operation takes the same
> > > > > amount of CPU, and after you do the delete you don’t have to pay
> > > > > for storage.
>
> > > > > I don’t do near as much in the Java Space but it doesn’t seem
> > > > > there should be much difference between Python and Java. I
> > > > > ported both the primary apps to both languages to do comparative
> > > > > cost analysis, and there have been a few things that we found
> > > > > were faster or cheaper with one or the other, as a result in
> > > > > some case we deploy both and use different versioning so they
> > > > > can both be live and attached to the same data.
>
> > > > > *From:* [email protected] [mailto:
> > > > > [email protected]] *On Behalf Of *André Pankraz
> > > > > *Sent:* Wednesday, December 28, 2011 12:06 AM
> > > > > *To:* [email protected]
> > > > > *Cc:* [email protected]
> > > > > *Subject:* [google-appengine] Re: Cautionary Tale: Abusive price
> > > > > for data migration and deletion
>
> > > > > Sry Brandon...he has a point - deleting data should be cheaper,
> > > > > even if it's technically the same like writing.
> > > > > Maybe he made some mistakes but you sometimes sound like a
> > > > > fanboy with GAE stockholm syndrome. ;) See what I did
> > > > > here...annoying accusations.
> > > > > You have very good experience with Python, Cache stuff, Edge
> > > > > cache etc., but do you really have experience with multiple 100
> > > > > GB datastore to talk like this?
> > > > > E.g.: I have also seen some answers from you (often very
> > > > > helpful) that are just plain wrong in the Java environment.
>
> > > > > --
> > > > > You received this message because you are subscribed to the
> > > > > Google Groups "Google App Engine" group.
> > > > > To view this discussion on the web visit
> > > > > https://groups.google.com/d/msg/google-appengine/-/oJRZxuV7yQgJ.
> > > > > To post to this group, send email to [email protected].
> > > > > To unsubscribe from this group, send email to
> > > > > [email protected].
> > > > > For more options, visit this group at
> > > > > http://groups.google.com/group/google-appengine?hl=en.
