It works, and the performance is breathtaking: 8.6 million entities (4.3 million lines x 2 entities per line) created in 1.5 hours, using 100 shards... Compared to my previous non-blob-based mapper job, CPU cost remains a little high (190 CPU hours), but I can't complain. Thank you guys.
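For a rough sense of what these figures imply, here is a back-of-the-envelope sketch in plain Java. It only redoes the arithmetic from this thread (8.6 million entities, 1.5 hours, 100 shards, 190 CPU hours, a 200 MB blob, and the roughly 10:1 read ratio Ikai mentions). The class and method names are hypothetical, and 1 MB is assumed to be 1,000,000 bytes; none of this is part of the appengine-mapreduce API.

```java
// Hypothetical helper, not part of appengine-mapreduce: it only redoes the
// arithmetic quoted in this thread.
final class MapperJobEstimate {
    private MapperJobEstimate() {}

    /** Average entities written per second over the whole job. */
    static double entitiesPerSecond(long entities, double wallClockHours) {
        return entities / (wallClockHours * 3600.0);
    }

    /** Average CPU time, in milliseconds, billed per entity. */
    static double cpuMillisPerEntity(long entities, double cpuHours) {
        return cpuHours * 3600.0 * 1000.0 / entities;
    }

    /** Blobstore bytes read expected for a blob, given a read-to-size ratio. */
    static long estimatedBytesRead(long blobBytes, double readRatio) {
        return Math.round(blobBytes * readRatio);
    }

    public static void main(String[] args) {
        long csvLines = 4_300_000L;
        long entities = csvLines * 2;      // one parent and one child per line
        int shards = 100;
        long blobBytes = 200_000_000L;     // 200 MB, assuming 1 MB = 10^6 bytes

        double perSecond = entitiesPerSecond(entities, 1.5);
        System.out.printf("entities/s overall:   %.1f%n", perSecond);
        System.out.printf("entities/s per shard: %.1f%n", perSecond / shards);
        System.out.printf("CPU ms per entity:    %.1f%n",
                cpuMillisPerEntity(entities, 190.0));
        System.out.printf("bytes read at 10:1:   %d%n",
                estimatedBytesRead(blobBytes, 10.0));
    }
}
```

With the numbers above this comes out to roughly 1,600 entities per second overall (about 16 per shard), about 80 ms of CPU per entity, and about 2 GB of blobstore reads for the 200 MB file, assuming the 10:1 ratio holds for the whole job.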

From: "Ikai Lan (Google)" <[email protected]>
Reply-To: <[email protected]>
Date: Wed, 17 Nov 2010 16:06:07 -0800
To: <[email protected]>
Subject: Re: [appengine-java] Mapper & Blobstore bytes read limit

The bug has been fixed. Check out the latest code from the appengine-mapreduce project.

Note that the ratio between blobstore bytes read and blob size is not 1:1. In my tests they were closer to 10:1. This is expected behavior for the time being. We're working on more options so users can better tune the behavior.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
Blogger: http://googleappengine.blogspot.com
Reddit: http://www.reddit.com/r/appengine
Twitter: http://twitter.com/app_engine

On Wed, Nov 17, 2010 at 2:19 AM, Cyrille Vincey <[email protected]> wrote:
> VERY good news.
> Can't wait. Thanks.
>
> From: "Ikai Lan (Google)" <[email protected]>
> Reply-To: <[email protected]>
> Date: Tue, 16 Nov 2010 12:07:59 -0800
> To: <[email protected]>
> Subject: Re: [appengine-java] Mapper & Blobstore bytes read limit
>
> We discovered a bug. We're not reading in the entire blob, but we are
> reading in far too much data.
>
> Fred has a fix waiting in the rafters. I'll post again when it's been pushed.
>
> --
> Ikai Lan
> Developer Programs Engineer, Google App Engine
>
> On Thu, Nov 4, 2010 at 2:36 AM, Cyrille Vincey <[email protected]> wrote:
>> Not a lot of interesting stuff to say:
>> 1. My code is about as simple as your sample code: the only real
>>    difference is that I create 2 parent/child entities in a row for one
>>    given csv line entry.
>> 2. My csv file contains 4.3 million lines.
>> 3. I launched the mapper job with 10 shards.
>> 4. "worker-attempt-XXX" tasks had 20 retries each on average.
>> 5. The blobstore bytes read quota (100 GB) got reached within the first
>>    3 hours.
>> 6. An estimated 10% of the entities were actually created after 24h (with
>>    my previous non-blob-based mapper job, those 4.3 million entities were
>>    created within 1 day).
>> 7. Log does not reveal anything interesting.
>>
>> I am currently running a new test with a 500,000-line csv file (20 MB
>> file). Performance looks better. To me, blob file size may have an
>> influence on the mapper performance.
>>
>> If you need more details, let me know.
>>
>> From: "Ikai Lan (Google)" <[email protected]>
>> Reply-To: <[email protected]>
>> Date: Wed, 3 Nov 2010 12:22:10 -0700
>> To: <[email protected]>
>> Subject: Re: [appengine-java] Mapper & Blobstore bytes read limit
>>
>> This behavior doesn't seem right. No, the entire blob should not be
>> getting read. We'll look into this.
>>
>> Do you have any more details? Could tasks be getting retried?
>>
>> --
>> Ikai Lan
>> Developer Programs Engineer, Google App Engine
>>
>> On Tue, Nov 2, 2010 at 9:42 AM, Cyrille Vincey <[email protected]> wrote:
>>> I've been testing Ikai's bulkload mapper (see url below) with a pretty
>>> big csv file (200 MB).
>>> It works great, and I encourage most of you to consider implementing
>>> this for entity uploads.
>>>
>>> Yet, I do face one last issue with an unexpected quota: blobstore bytes
>>> read.
>>> This quota cannot be tuned via the billing settings, and it's not clear
>>> whether it limits the speed of my process or not when it's reached.
>>>
>>> See? Yep, it's a lot of bytes read...
>>> Could someone confirm that the blob csv file is *NOT* fully fetched each
>>> time the mapper iterates on a new line?
>>>
>>> (ikai's post)
>>> http://ikaisays.com/2010/08/11/using-the-app-engine-mapper-for-bulk-data-import/
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "Google App Engine for Java" group.
>>> To post to this group, send email to [email protected].
>>> To unsubscribe from this group, send email to [email protected].
>>> For more options, visit this group at
>>> http://groups.google.com/group/google-appengine-java?hl=en.
<<inline: Capture d'écran 2010-11-02 à 17.17.25.png>>
