Okay, looks like these whitepapers are at research.google.com now:

http://research.google.com/archive/bigtable.html

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com



On Thu, Feb 2, 2012 at 7:03 PM, Ikai Lan (Google) <[email protected]> wrote:

> Robert, I'll see what I can do. No promises on an ETA. It isn't in one of
> the white papers?
>
> http://labs.google.com/papers/bigtable.html
>
> Oh what the heck ... the link is broken. Let me see what's up.
>
> --
> Ikai Lan
> Developer Programs Engineer, Google App Engine
> plus.ikailan.com
>
>
>
> On Thu, Feb 2, 2012 at 1:56 PM, Robert Kluin <[email protected]> wrote:
>
>> Yeah, Ikai is completely correct.  I should have noted more clearly
>> that this is not something I even waste time worrying about until I
>> think I'm actually hitting it, which is not often.  In the few cases
>> where I do think I've bumped into it, it was a
>> writing-thousands-of-entities-per-second type of thing -- which is not
>> very common.
>>
>> It is interesting that sharding is determined by access patterns.  Is
>> that something you can elaborate on at all?  ;)
>>
>>
>> Robert
>>
>>
>>
>> On Thu, Feb 2, 2012 at 16:14, Ikai Lan (Google) <[email protected]> wrote:
>> > Thanks for the answers, Robert.
>> >
>> > Shard size isn't determined by amount of data, but by access patterns.
>> > An example of an anti-pattern that will cause a shard size imbalance
>> > would be an entity write every time a user takes an action - but you
>> > never do anything with this data. Since the data just kind of
>> > accumulates, the shard never splits (unless it hits some hardware
>> > bound, which I've never really seen happen yet with GAE data).
>> >
>> > As a final note, it takes a LOT of writes before this sort of thing
>> > happens, and I sometimes regret writing that blog post because anytime
>> > you write a blog post about scalability patterns, it invites people to
>> > prematurely implement them (Brett Slatkin's video generated an endless
>> > number of questions from people doing sub 1 QPS). We've done launches
>> > on the YouTube/Google homepage
>> > (http://blog.golang.org/2011/12/from-zero-to-go-launching-on-google.html)
>> > that haven't required us to make these changes because they did fine
>> > under load testing. I'd invest more energy in figuring out the right
>> > way to load test, then trying to figure out the bottlenecks when you
>> > hit limits with real data.
>> >
>> > --
>> > Ikai Lan
>> > Developer Programs Engineer, Google App Engine
>> > plus.ikailan.com
>> >
>> >
>> >
>> > On Wed, Feb 1, 2012 at 9:19 PM, Robert Kluin <[email protected]> wrote:
>> >>
>> >> So I'd say don't worry about it unless you actually hit this problem.
>> >> If you do know you'll hit it, see if you have a way to "shard" the
>> >> timestamp by account, user, region, etc., to relieve some of the
>> >> pressure.  If you must have a global timestamp, I'd say keep it as
>> >> simple as possible, until you hit the issue.  At that point you can
>> >> figure out a fix.
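
One way to picture "sharding" the timestamp as suggested above is to prefix
the indexed value with a shard name, so writes land in several key ranges
instead of one. This is only a sketch under assumptions: the helper names
`sharded_timestamp` and `merge_by_time`, the "|" separator, and the string
encoding are all hypothetical and not part of any App Engine API. It also
assumes shard names never contain "|".

```python
import datetime
import heapq


def sharded_timestamp(shard, ts):
    """Build a string value that sorts by shard first, then by time.

    `shard` is a hypothetical label such as an account, user, or region.
    """
    return "%s|%s" % (shard, ts.strftime("%Y-%m-%dT%H:%M:%S.%f"))


def merge_by_time(per_shard_values):
    """Merge per-shard lists (each already time-sorted) into global time order."""
    return list(
        heapq.merge(*per_shard_values, key=lambda v: v.split("|", 1)[1])
    )


if __name__ == "__main__":
    eu = [
        sharded_timestamp("eu", datetime.datetime(2012, 2, 1, 10, 0)),
        sharded_timestamp("eu", datetime.datetime(2012, 2, 1, 12, 0)),
    ]
    us = [sharded_timestamp("us", datetime.datetime(2012, 2, 1, 11, 0))]
    for value in merge_by_time([eu, us]):
        print(value)
```

The trade-off is on the read side: a query ordered by time across all shards
has to run once per shard and merge the results, as `merge_by_time` does here.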
>> >>
>> >> When I have timestamps on high write-rate entities that are
>> >> non-critical, for example "expiration" times that are used only for
>> >> cleanup, I'll sometimes add a random jitter of several hours to spread
>> >> the writes out a bit.  I'd be surprised if changing it by a few
>> >> seconds helped much -- but it could.  Keep in mind, there will already
>> >> be some degree of randomness since the instance clocks have some
>> >> slight variation.  If you're hitting this issue, I'd give it a shot
>> >> though.  If it works it could at least buy you some time to get a
>> >> better fix.
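
A minimal Python sketch of the jitter idea described above, for non-critical
timestamps such as cleanup expirations. The function name
`jittered_expiration` and the three-hour default are illustrative
assumptions; nothing here is datastore-specific or taken from an App Engine
API.

```python
import datetime
import random


def jittered_expiration(base, max_jitter_hours=3, rng=random):
    """Return `base` plus a random offset between 0 and `max_jitter_hours` hours.

    `rng` can be the `random` module or a seeded `random.Random` instance.
    """
    offset_seconds = rng.uniform(0, max_jitter_hours * 3600)
    return base + datetime.timedelta(seconds=offset_seconds)


if __name__ == "__main__":
    base = datetime.datetime(2012, 2, 1, 12, 0, 0)
    print(jittered_expiration(base))
```

Whether a given amount of jitter actually spreads writes across tablets
depends on how the index is split, which this thread notes is not publicly
documented, so treat the jitter window as something to tune by experiment.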
>> >>
>> >> I don't think there is a fixed number of rows per shard.  I think it
>> >> is split up by data size, and I don't think the exact number is
>> >> publicly documented.  Maybe you can roughly figure it out via
>> >> experimentation.
>> >>
>> >>
>> >> Robert
>> >>
>> >>
>> >> On Wed, Feb 1, 2012 at 02:28, WGuerlich <[email protected]> wrote:
>> >> > I know I'm going to hit the write limit with a timestamp I need to
>> >> > update on every write and which needs to be indexed.
>> >> >
>> >> > As an alternative to sharding: What do you think about adding time
>> >> > jitter to the timestamp, that is, changing time randomly by a couple
>> >> > seconds? In my application the timestamp being off by a couple
>> >> > seconds wouldn't pose a problem.
>> >> >
>> >> > Now what I need to know is: How many index entries can I expect to go
>> >> > into one tablet? This is needed to estimate the amount of jitter
>> >> > necessary to avoid hitting the same tablet on every write.
>> >> >
>> >> > Any insights on this?
>> >> >
>> >> > Wolfram
>> >> >
>> >> > --
>> >> > You received this message because you are subscribed to the Google
>> >> > Groups "Google App Engine" group.
>> >> > To view this discussion on the web visit
>> >> > https://groups.google.com/d/msg/google-appengine/-/r0SVTq6i4iEJ.
>> >> >
>> >> > To post to this group, send email to [email protected].
>> >> > To unsubscribe from this group, send email to
>> >> > [email protected].
>> >> > For more options, visit this group at
>> >> > http://groups.google.com/group/google-appengine?hl=en.
>> >>
>> >>
>> >
>>
>>
>>
>

