Hi Brandon,

Interesting story, but you rarely design Facebook for 500 million
people right from the start, and alone...

Anyway, I would love to know how much it would cost you and how long
you would need to get your data out of your super-big apps.

Please share.

Cheers

On Dec 29, 6:58 pm, "Brandon Wirtz" <[email protected]> wrote:
> If you check the archives I have shared times when my requests were well
> over 5000/s.
>
> I would say GAE handles big data really well. But you have to do testing to
> make sure your structure is correct, and that your indexes are well thought
> out.
>
> Planning is always possible.  Testing is always possible. But it is like
> driving my Mini Cooper around Laguna Seca vs. driving a Ferrari around it.
> The Ferrari is only faster if you can handle it. My mom can run laps in the
> Mini Cooper, but would end up in the wall in a Ferrari.
>
> Or like the discussion about executing code from students.
>
> GAE is cycles on demand, so if you can build your app to be efficient it is
> cheap. If you build it with errors it is expensive.
>
> I recently found I could knock 3% off of my bill by disabling logging.
> That's the level of testing we do.  People say, "But how can you afford to
> pay devs to write code if you worry that much?"  Well, we are betting on the
> long haul. We only need to learn the lesson once to capitalize on it for
> years.
>
> You say you can't predict growth. Sure I can. I either engineer something to
> work for me and 3 of my friends, or I engineer it to be the next Facebook.
> There is room for some differences along the way, but I could build Facebook
> on GAE.  No worry about big data, or scaling. (I think the GAE team would
> deploy servers for me as fast as I could fill them.)
>
> Things that are designed for you and your friends you don't market, you
> don't tell people about, so they don't grow.  When CDNinabox went from
> something Brandon uses for his sites to being a product, the product got
> lots of complete rewrites. After testing in Java and Python, the caching
> mechanism ended up using 4 different models, chosen by the type of traffic
> the site we are accelerating gets.  One hack for me became a piece of
> software with 40+ optimizations that can be turned on and off to make things
> run up to 80% cheaper than the defaults. And to pick those settings we test.
> We even schedule changes to test real traffic for periods of time.
>
> I think the real lesson I'm trying to convey is one I learned at MSFT.  For
> every dev there is 1/40th of a CTO, 1/10 of a product manager, 2 test
> engineers, 1/5 of a release manager, and 1/5 of a performance engineer. That
> is about 2.5 support staff for every programmer.  If you are just writing
> code, you are working in a vacuum that makes it hard to plan, test, debug,
> and run scalability metrics.
>
> -----Original Message-----
> From: [email protected]
>
> [mailto:[email protected]] On Behalf Of Yohan
> Sent: Thursday, December 29, 2011 2:00 AM
> To: Google App Engine
> Subject: [google-appengine] Re: Cautionary Tale: Abusive price for data
> migration and deletion
>
> Hi Brandon,
>
> Well, I started using GAE simply because 2 years ago I was a tech team of 1
> and I couldn't afford to hire full-time sysadmins. I'm migrating some of my
> stuff out now that I have more guys to help me. And GAE is a great platform
> that runs on its own and doesn't require much administration (I launched
> games and apps on it that just run for months with no major issues). So it
> is great for starting up. But as soon as you enter the big data domain, you
> need more control over the way you can process and move your data around
> (the big companies all have their own datacenters because they need full
> control over the infrastructure), and thus a PaaS may not be suitable
> anymore.
>
> It's hard to plan for your business growing 10x within a few months, when
> the tech infrastructure must suddenly grow from 50 req/s to 5,000 req/s. BTW,
> GAE can't handle such load well (a minimum latency of 500 ms on Java
> seriously sucks, not to mention write contention on the datastore). It is
> easy to plan when everything can be defined in advance (with budgets and
> stuff), but you don't always have the option.
>
> But thanks for sharing your inputs anyway, always appreciated ;)
>
> On Dec 29, 4:31 pm, "Brandon Wirtz" <[email protected]> wrote:
> > Development is not about not making mistakes, it is about doing
> > structured performance testing and cost analysis.
>
> > My team writes 500 lines of code for every 50 that make it into the
> > final product.
>
> > We know things about the efficiencies of Do While vs. ForEach that
> > quite possibly Google doesn't even know.  We are that anal about
> > testing.  We test query speed done different ways and compare cost
> > and performance based on the anticipated ratios of use.
>
> > We just never let "mistakes" grow to the point we can't control them.
>
> > -----Original Message-----
> > From: [email protected]
>
> > [mailto:[email protected]] On Behalf Of Raymond
> > Sent: Thursday, December 29, 2011 12:13 AM
> > To: Google App Engine
> > Subject: [google-appengine] Re: Cautionary Tale: Abusive price for
> > data migration and deletion
>
> > Dear Yohan,
>
> > For my part, I thank you for sharing your experience. I am beginning
> > with GAE and know that however much time I put into this project I
> > will be making beginner mistakes, and this kind of info is precious.
> > I now have limited experience with GAE and have to compare it with
> > what I know, and in some areas GAE looks very bad. For example, I can't
> > imagine Oracle, DB2, Informix, MsSQL, etc. having any
> > commercial success if they had not implemented rock-solid
> > solutions to import and export data, back up, build and drop tables and
> > databases, and of course calculate precisely the data space required to
> > build a data structure, in some cases down to the byte.
> > Although I understand the very different nature of GAE compared to
> > these traditional DB engines, I think that any professional developer,
> > IT manager, project manager, or person responsible for a budget would
> > feel very uncomfortable building a system without a firm grip on its
> > costs or a reasonable way to modify an initial implementation or
> > migrate away from it. Also, the fact that some of the GAE tools are
> > simply not reliable enough to let you plan the effort and time required
> > to do something is another big minus for this solution.
>
> > Although DBs are not my main competence, my very first paid job
> > 20+ years ago was to migrate a critical database to a new structure on
> > a new machine (HP 9000 Unix), using a long-forgotten database engine.
> > The first attempt, using SQL, took 1 week to migrate; the second, using
> > low-level C calls, took months to develop and migrated in the required
> > 3.5 hours. But the important thing to note is that it never crossed my
> > mind to question the reliability of the machine, the database, or the C
> > calls I was making to the DB. It just worked. The server could be
> > locked for minutes swapping to disk because of lack of memory or
> > overload, but it never failed once and repeated the exercise time and
> > time again, reliably and in a predictable timeframe.
>
> > All this said, there are advantages to GAE that are worth fighting with
> > its limitations. I have not yet found anything else that is so
> > immediately and massively scalable and at the same time does not
> > require me to manage the software and hardware. This is invaluable,
> > and although I know that I could have an easier job moving to MySQL, I
> > just don't want to manage an OS and a DB engine. I don't have the
> > time, I have done it, and I don't think that's where I am going to earn
> > my bacon.
>
> > I will always envy some of the people answering your message for the
> > depth of knowledge they have of this platform and the fact that they
> > always have the right solution and right answer to everything, it must
> > be great to never make mistakes.
>
> > -R
>
> > On Dec 29, 6:25 am, jon <[email protected]> wrote:
> > > Yohan I agree that there should be an easy and cheap way to get your
> > > data out. I think it's a little unfair that leaving GAE is made that
> > > hard.
>
> > > How much did you spend on your custom data download tool? Would you
> > > consider open sourcing it for other developers who are caught in the
> > > same position? I'd hate spending weeks building a custom tool just
> > > to get my data out.
>
> > > Thanks for sharing your experience.
>
> > > On Dec 29, 12:26 am, Yohan <[email protected]> wrote:
>
> > > > Hi Brandon,
>
> > > > Although I agree with you that the original dataset wasn't fully
> > > > optimized (that was over 2 years ago), I believe that I have a
> > > > good understanding of datastore vs. SQL, caching, etc. I'm not
> > > > building a public-facing website; I'm dealing with private APIs, and
> > > > I am already stretching memcache and a custom-built Java cache to
> > > > the limits.
>
> > > > I am also not talking about the reasons why I'm migrating out of GAE.
> > > > The points I highlighted were:
>
> > > > - no easy way to get your data out
> > > > - no cheap way to get your big data out
> > > > - bulk export in Python doesn't handle binary/blob data
> > > > - Remote API is unstable
> > > > - running database queries using cursors for long periods of time
> > > > is unreliable (many times the cursor got reset for some reason, or
> > > > the query would return a 0000000 cursor, thus screwing up 1 week of
> > > > data processing)
> > > > - it cost me an arm to delete my data
>
> > > > To answer other questions:
> > > > - Of course I thought about migrating the remaining data to a new
> > > > app, then aliasing from the old app to the new one. But that means
> > > > interrupting the service (disabling datastore writes), and I can't
> > > > afford that. Plus, the remaining data is still quite big.
> > > > - The multiple indexes: every time I changed the data structure I
> > > > would reprocess everything to conform it to the new schema. I'm not
> > > > using any framework like Objectify or JDO; I'm working with the raw
> > > > API directly (which is way more elegant).
> > > > - I'm not criticizing the platform; I am criticizing the lack of
> > > > tools to export and the prohibitive cost of manipulating large data
> > > > sets.
> > > > I actually love GAE; it is just not for this kind of dataset, that's
> > > > all.
>
> > > > @Brandon : If
>
> ...

