Here's what I don't get -- how is this different from allocating a separate table for each value of the leading field? If I did that and used the second field as the leading prefix instead, no one would argue that such a key distributes poorly. I don't plan on doing this because it would mean too many tables, but I don't see a fundamental difference between the two approaches.
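The equivalence can be made concrete with a small sketch. This is plain Python, not the HBase API, and the source/entity values are made up for illustration: a single table of composite keys, kept in the lexicographic order HBase stores rows in, is just the concatenation of the hypothetical per-prefix tables.

```python
# Illustrative sketch (plain Python, not HBase): a single table of
# composite "<source>:<entity>" keys sorts into exactly the same
# per-source runs you would get from one hypothetical table per source.

entities = ["b9f2", "0a41", "77cd"]   # non-sequential entity identifiers
sources = [1, 2, 3]                   # low-cardinality leading field

# One table with composite keys, in HBase's lexicographic row order:
composite = sorted(f"{s}:{e}" for s in sources for e in entities)

# One hypothetical table per source, each keyed by the entity id alone:
per_table = {s: sorted(entities) for s in sources}

# Concatenating the per-source tables in source order reproduces the
# composite table: same ordering, same contiguity, same write pattern.
flattened = [f"{s}:{e}" for s in sorted(per_table) for e in per_table[s]]
assert composite == flattened
```

Either way, writes for one source land in one contiguous run of the key space, which is the point of the thought experiment.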
As an experiment though, let's say I did exactly that. In that case, hashing the keys accomplishes virtually nothing, since it spreads the values I'm asking for across a range of possibilities that's no larger, more dispersed, or less ordered than if I used the identifiers directly. Does that make sense?

On Tue, Sep 4, 2012 at 11:04 PM, Michael Segel <[email protected]> wrote:

> Uhm...
>
> This isn't very good.
> In terms of inserting, you will hit a single or small subset of regions.
>
> This may not be that bad if you have enough data and the rows are not all
> inserting into the same region.
>
> Since you're hitting an index to pull rows one at a time, you could do
> this... if you know the exact record you want, you could hash the key and
> then you wouldn't have a problem of hot spotting.
>
> On Sep 4, 2012, at 1:51 PM, Eric Czech <[email protected]> wrote:
>
> > How does the data flow in to the system? One source at a time?
>
> Generally, it will be one source at a time, where these rows are index
> entries built from MapReduce jobs.
>
> > The second field. Is it sequential?
>
> No, the index writes from the MapReduce jobs should dump some relatively
> small number of rows into HBase for each first field - second field
> combination, but then move on to another first field - second field
> combination where the new second field is not ordered in any way relative
> to the old second field.
>
> > How are you using the data when you pull it from the database?
>
> Not totally sure what specific use cases you might be asking after, but
> in a more general sense, the indexed data will power our web platform (we
> aggregate and manage data for the music industry) as well as work as
> inputs to offline analytics processes.
> I'm placing the design priority on the interaction with the web platform
> though, and the full row structure I'm intending to use is:
>
>
> This is similar to OpenTSDB, and the service we provide is similar to
> what OpenTSDB was designed for, if that gives you a better sense of what
> I'd like to do with the data.
>
> On Tue, Sep 4, 2012 at 2:03 PM, Michael Segel <[email protected]>
> wrote:
>
> > Eric,
> >
> > So here's the larger question...
> > How does the data flow in to the system? One source at a time?
> >
> > The second field. Is it sequential? If not sequential, is it going to
> > be some sort of increment larger than the previous value? (Are you
> > always inserting to the left side of the queue?)
> >
> > How are you using the data when you pull it from the database?
> >
> > 'Hot spotting' may be unavoidable and, depending on other factors, it
> > may be a moot point.
> >
> > On Sep 4, 2012, at 12:56 PM, Eric Czech <[email protected]> wrote:
> >
> > > Longer term... what's really going to happen is more like I'll have a
> > > first field value of 1, 2, and maybe 3. I won't know 4 - 10 for a
> > > while, and the *second* value after each initial value will be,
> > > although highly unique, relatively exclusive to a given first value.
> > > This means that even if I didn't use the leading prefix, I'd have
> > > more or less the same problem, where all the writes go to the same
> > > region when I introduce a new set of second values.
> > >
> > > In case the generalities are confusing, the prefix value is a data
> > > source identifier and the second value is an identifier for entities
> > > within that source. The entity identifiers for a given source are
> > > likely to span different numeric or alphanumeric ranges, but they
> > > probably won't be the same ranges across sources. Also, I won't know
> > > all those ranges (or sources, for that matter) upfront.
> > >
> > > I'm concerned about the introduction of a new data source (= leading
> > > prefix value), since the first writes will all go to the same region.
> > > Ideally I'd be able to get a sense of how the second values are
> > > distributed for the new leading prefix and split an HBase region to
> > > reflect that. If that's not possible, or just turns out to be a pain,
> > > then I can live with the introduction of the new prefix being a
> > > little slow until the regions split and distribute effectively.
> > >
> > > Does that make sense?
> > >
> > > On Tue, Sep 4, 2012 at 1:34 PM, Michael Segel
> > > <[email protected]> wrote:
> > >
> > >> I think you have to understand what happens as a table splits.
> > >> If you have a composite key where the first field has a value
> > >> between 0 and 9 and you pre-split your table, all of your 1's will
> > >> go to a single region until it splits. But both splits will start on
> > >> the same node until they eventually get balanced out.
> > >>
> > >> (Note: I'm not an expert on how HBase balances the regions across
> > >> the region servers, so I couldn't tell you how it chooses which node
> > >> to place each region on.)
> > >>
> > >> But what are you trying to do? Avoid a hot spot on the initial load,
> > >> or are you looking at the longer-term picture?
> > >>
> > >> On Sep 3, 2012, at 2:58 PM, Eric Czech <[email protected]> wrote:
> > >>
> > >>> With regards to:
> > >>>
> > >>>> If you have 3 region servers and your data is evenly distributed,
> > >>>> that means all the data starting with a 1 will be on server 1,
> > >>>> and so on.
> > >>>
> > >>> Assuming there are multiple regions in existence for each prefix,
> > >>> why would they not be distributed across all the machines?
> > >>>
> > >>> In other words, if there are many regions with keys that generally
> > >>> start with 1, why would they ALL be on server 1 like you said?
> > >>> It's my understanding that the regions aren't placed around the
> > >>> cluster according to the range of information they contain, so I'm
> > >>> not quite following that explanation.
> > >>>
> > >>> Putting the higher-cardinality values in front of the key isn't
> > >>> entirely out of the question, but I'd like to use the
> > >>> low-cardinality key out front for the sake of selecting rows for
> > >>> MapReduce jobs. Otherwise, I always have to scan the full table for
> > >>> each job.
> > >>>
> > >>> On Mon, Sep 3, 2012 at 3:20 PM, Jean-Marc Spaggiari
> > >>> <[email protected]> wrote:
> > >>>
> > >>>> Yes, you're right, but again, it will depend on the number of
> > >>>> region servers and the distribution of your data.
> > >>>>
> > >>>> If you have 3 region servers and your data is evenly distributed,
> > >>>> that means all the data starting with a 1 will be on server 1,
> > >>>> and so on.
> > >>>>
> > >>>> So if you write a million lines starting with a 1, they will all
> > >>>> land on the same server.
> > >>>>
> > >>>> Of course, you can pre-split your table, like 1a to 1z, and assign
> > >>>> each region to one of your 3 servers. That way you will avoid
> > >>>> hot-spotting even if you write millions of lines starting with a 1.
> > >>>>
> > >>>> If you have one hundred regions, you will face the same issue at
> > >>>> the beginning, but the more data you add, the more your table will
> > >>>> be split across all the servers and the less hot-spotting you will
> > >>>> have.
> > >>>>
> > >>>> Can't you just reverse your fields and put the 1 to 30 at the end
> > >>>> of the key?
> > >>>>
> > >>>> 2012/9/3, Eric Czech <[email protected]>:
> > >>>>> Thanks for the response Jean-Marc!
> > >>>>>
> > >>>>> I understand what you're saying, but consider a more extreme
> > >>>>> case: let's say I'm choosing the leading number in the range
> > >>>>> 1 - 3 instead of 1 - 30.
> > >>>>> In that case, it seems like all of the data for any one prefix
> > >>>>> would already be split well across the cluster, and as long as
> > >>>>> the second value isn't written sequentially, there wouldn't be an
> > >>>>> issue.
> > >>>>>
> > >>>>> Is my reasoning there flawed at all?
> > >>>>>
> > >>>>> On Mon, Sep 3, 2012 at 2:31 PM, Jean-Marc Spaggiari
> > >>>>> <[email protected]> wrote:
> > >>>>>
> > >>>>>> Hi Eric,
> > >>>>>>
> > >>>>>> In HBase, data is stored sequentially based on the alphabetical
> > >>>>>> order of the key.
> > >>>>>>
> > >>>>>> It will depend on the number of regions and region servers you
> > >>>>>> have, but if you write data from 23AAAAAA to 23ZZZZZZ, the rows
> > >>>>>> will most probably go to the same region even if the cardinality
> > >>>>>> of the 2nd part of the key is high.
> > >>>>>>
> > >>>>>> If the first number is always changing between 1 and 30 for each
> > >>>>>> write, then you will reach multiple regions/servers if you have
> > >>>>>> enough of them; otherwise, you might see some hot-spotting.
> > >>>>>>
> > >>>>>> JM
> > >>>>>>
> > >>>>>> 2012/9/3, Eric Czech <[email protected]>:
> > >>>>>>> Hi everyone,
> > >>>>>>>
> > >>>>>>> I was curious whether or not I should expect any write hot
> > >>>>>>> spots if I structured my composite keys in a way such that the
> > >>>>>>> first field is a low-cardinality value (maybe 30 distinct
> > >>>>>>> values) and the next field contains a very high-cardinality
> > >>>>>>> value that would not be written sequentially.
> > >>>>>>>
> > >>>>>>> More concisely, I want to do this:
> > >>>>>>>
> > >>>>>>> Given one number between 1 and 30, write many millions of rows
> > >>>>>>> with keys like <number chosen> : <some generally distinct,
> > >>>>>>> non-sequential value>
> > >>>>>>>
> > >>>>>>> Would there be any problem with the millions of writes
> > >>>>>>> happening with the same first field key prefix, even if the
> > >>>>>>> second field is largely unique?
> > >>>>>>>
> > >>>>>>> Thank you!
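The scenario in the original question can be simulated with a toy model. This is an illustrative Python sketch, not HBase: the split points and the region-assignment rule are assumptions invented for the example. It shows both halves of the thread's answer: every write under a single prefix stays inside that prefix's slice of the key space, but if that slice spans several regions and the second field is non-sequential, the writes spread evenly among them.

```python
import bisect
import hashlib

# Toy model of region assignment: region i owns keys in the range
# [splits[i-1], splits[i]).  These split points are assumed purely for
# illustration -- four regions inside prefix "1", then prefixes "2", "3".
splits = ["1:4", "1:8", "1:c", "2:", "2:4", "2:8", "2:c", "3:"]

def region_for(key: str) -> int:
    """Index of the region whose key range contains `key`."""
    return bisect.bisect_right(splits, key)

# Simulate many writes under the single leading prefix "1", with a
# non-sequential second field (here, a hex digest of a counter):
counts = {}
for i in range(10_000):
    second = hashlib.md5(str(i).encode()).hexdigest()[:8]
    r = region_for(f"1:{second}")
    counts[r] = counts.get(r, 0) + 1

# Every write lands in one of prefix "1"'s four regions (0..3), but the
# load is spread roughly evenly among them -- Eric's point about the
# non-sequential second value.
assert set(counts) <= {0, 1, 2, 3}
assert min(counts.values()) > 1500   # ~2,500 expected per region
```

The same model also shows the failure mode Jean-Marc describes: before the prefix's slice has been split (a single split point), every one of those writes would map to one region.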
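For completeness, Michael's hash-the-key idea from earlier in the thread can be sketched as key salting. This is a minimal illustration in plain Python under assumed names (`salted`, the bucket count, the key format are all made up, and this is not an HBase client call). The trade-off it encodes: an exact-match read can always recompute the salt, but range scans in the logical key order are lost, which is why it only fits the "you know the exact record you want" case.

```python
import hashlib

def salted(key: str, buckets: int = 16) -> str:
    """Prefix `key` with a stable, hash-derived bucket number so that
    logically adjacent keys scatter across the key space (and hence
    across pre-split regions) instead of piling into one.  The salt is
    a pure function of the key, so point reads can reconstruct it; what
    you give up is scanning a contiguous logical key range."""
    bucket = int(hashlib.md5(key.encode()).hexdigest(), 16) % buckets
    return f"{bucket:02d}:{key}"

# A point read can always rebuild the stored row key deterministically:
assert salted("2:entity-00042") == salted("2:entity-00042")

# The logical key survives intact after the salt prefix:
assert salted("2:entity-00042").endswith(":2:entity-00042")
```

This is also why hashing buys nothing for Eric's case: with a low-cardinality leading field and a non-sequential second field, the keys are already dispersed, so the salt only reshuffles an already-scattered distribution.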
