Dear Yohan,

For my part, thank you for sharing your experience. I am just beginning with GAE, and I know that however much time I put into this project I will make beginner mistakes, so this kind of information is precious.

I now have limited experience with GAE and have to compare it with what I know, and in some areas GAE looks very bad. For example, I can't imagine Oracle, DB2, Informix, MS SQL, etc. having any commercial success if they had not implemented rock-solid solutions to import and export data, back up, build and drop tables and databases, and of course calculate precisely the data space required to build a data structure, in some cases down to the byte.

Although I understand how different GAE is from these traditional DB engines, I think that any professional developer, IT manager, project manager, or person responsible for a budget would feel very uncomfortable building a system without a firm grip on its costs, or without a reasonable way to modify an initial implementation or migrate away from it. Also, the fact that some of the GAE tools are simply not reliable enough to let you plan the effort and time required to do something is another big minus for this solution.

Although DBs are not my main competence, my very first paid job 20+ years ago was to migrate a critical database to a new structure on a new machine (HP 9000 Unix), using a long-forgotten database engine. The first attempt, using SQL, took 1 week to migrate; the second, using low-level C calls, took months to develop and migrated in the required 3.5 hours. The important thing to note is that it never crossed my mind to question the reliability of the machine, the database, or the C calls I was making to the DB. It just worked. The server could be locked for minutes swapping to disk because of lack of memory or overload, but it never failed once, and it repeated the exercise time and time again, reliably and in a predictable timeframe.

All this said, GAE has advantages that make it worth fighting with its limitations. I have not yet found anything else that is so immediately and massively scalable and at the same time does not require me to manage the software and hardware. This is invaluable. Although I know I could have an easier job moving to MySQL, I just don't want to manage an OS and a DB engine. I don't have the time; I have done it, and I don't think that's where I am going to earn my bacon.

I will always envy some of the people answering your message for the depth of knowledge they have of this platform and the fact that they always have the right solution and the right answer to everything. It must be great to never make mistakes.

-R

On Dec 29, 6:25 am, jon <[email protected]> wrote:
> Yohan I agree that there should be an easy and cheap way to get your
> data out. I think it's a little unfair that leaving GAE is made that
> hard.
>
> How much did you spend on your custom data download tool? Would you
> consider open sourcing it for other developers who are caught in the
> same position? I'd hate spending weeks building a custom tool just to
> get my data out.
>
> Thanks for sharing your experience.
>
> On Dec 29, 12:26 am, Yohan <[email protected]> wrote:
>
> > Hi Brandon,
> >
> > Although i agree with you that the original dataset wasnt fully
> > optimized (that was over 2 years ago), i believe that i have a good
> > understanding of datatore vs SQL, caching etc. Im not building public
> > facing website im dealing with private apis and I am already
> > stretching memcache and custom built java cache to the limits.
> >
> > I am also not talking about the reasons why im migrating out of GAE.
> > The points i highlighted were:
> >
> > - no easy way to get your data out
> > - no cheap way to get your big data out
> > - bulk export in python doesn't handle binary/blob data
> > - remote api is unstable
> > - running database queries using cursors for long period of time is
> >   unreliable (many times the cursor got reset for some reason or the
> >   query would return a 0000000 cursor thus screwing 1 week of data
> >   processing)
> > - it cost me an arm to delete my data
> >
> > To answer other questions :
> >
> > - of course i thought about migrating the remaining data to a new app
> >   then alias from the old app to the new one. But it means interrupting
> >   the service (disable datastore writes) and i cant afford that. Plus
> >   the remaining data is still quite big.
> > - the multi indexes: everytime i changed the data structure i would
> >   reprocess everything to conform it to the new schema. Im not using any
> >   framework like objectify or jdo, im working with the raw api directly
> >   (which is way more elegant)
> > - im not criticizing the platform i am criticizing the lack of tools
> >   to export and the prohibitive cost of manipulating large data sets. I
> >   actually love GAE, it is just not for this kind of dataset thats all.
> >
> > @Brandon : If you have a way to delete 2 billions entities (whatever
> > their size) on the cheap please let me know.
> >
> > On Dec 28, 8:48 pm, Leandro Rezende <[email protected]> wrote:
> >
> > > u pay to write, pay to keep it stored... delete should be free.
> > >
> > > 2011/12/28 Brandon Wirtz <[email protected]>
> > >
> > > > Yes,
> > > >
> > > > While the primary app I talk about is edge Cache, that’s because that’s
> > > > the thing that people can most benefit from that people don’t seem to be
> > > > using.
> > > >
> > > > As part of my SEO tools we have what is now a 60 TB database of Backlinks
> > > > and Crawler data about websites in the top 200k Alexa Sites.
> > > >
> > > > Why should Deleting be Cheaper? The Operation takes the same amount of
> > > > CPU, and after you do the delete you don’t have to pay for storage.
> > > >
> > > > I don’t do near as much in the Java Space but it doesn’t seem there should
> > > > be much difference between Python and Java. I ported both the primary apps
> > > > to both languages to do comparative cost analysis, and there have been a
> > > > few things that we found were faster or cheaper with one or the other, as a
> > > > result in some case we deploy both and use different versioning so they can
> > > > both be live and attached to the same data.
> > > >
> > > > *From:* [email protected] [mailto:[email protected]] *On Behalf Of* André Pankraz
> > > > *Sent:* Wednesday, December 28, 2011 12:06 AM
> > > > *To:* [email protected]
> > > > *Cc:* [email protected]
> > > > *Subject:* [google-appengine] Re: Cautionary Tale: Abusive price for data
> > > > migration and deletion
> > > >
> > > > Sry Brandon...he has a point - deleting data should be cheaper, even if
> > > > it's technically the same like writing.
> > > > Maybe he made some mistakes but you sometimes sound like a fanboy with GAE
> > > > stockholm syndrome. ;) See what I did here...annoying accusations.
> > > > You have very good experience with Python, Cache stuff, Edge cache etc.,
> > > > but do you really have experience with multiple 100 GB datastore to talk
> > > > like this?
> > > > E.g.: I have also seen some answers from you (often very helpful) that are
> > > > just plain wrong in the Java environment.
> > > >
> > > > --
> > > > You received this message because you are subscribed to the Google Groups
> > > > "Google App Engine" group.
> > > > To view this discussion on the web visit
> > > > https://groups.google.com/d/msg/google-appengine/-/oJRZxuV7yQgJ.
> > > > To post to this group, send email to [email protected].
> > > > To unsubscribe from this group, send email to
> > > > [email protected].
> > > > For more options, visit this group at
> > > > http://groups.google.com/group/google-appengine?hl=en.
