Dylan, thanks for the detailed explanation. Very helpful! Josh, appreciate the hint about combiners.
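To make sure I follow the double-insert idea, here is roughly what I had in mind for the ingest side in plain Java (untested sketch; the table names, the "degree" column, the empty column family, and the zero-padded Load values are just my guesses from your printout). I also put a second sketch inline below, right after your query-optimization explanation.

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.IteratorSetting;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.LongCombiner;
import org.apache.accumulo.core.iterators.user.SummingCombiner;
import org.apache.hadoop.io.Text;

public class D4mIngestSketch {
  static final Value ONE = new Value("1".getBytes());

  public static void main(String[] args) throws Exception {
    // Assumes the tables "readings", "readingsT" and "readingsDeg" already exist.
    Connector conn = new ZooKeeperInstance("instance", "zkhost:2181")
        .getConnector("user", new PasswordToken("secret"));

    // One-time setup: a SummingCombiner on the degree table adds up the
    // repeated "1" values written under the same row and column.
    IteratorSetting deg = new IteratorSetting(7, "sum", SummingCombiner.class);
    SummingCombiner.setEncodingType(deg, LongCombiner.Type.STRING);
    SummingCombiner.setCombineAllColumns(deg, true);
    conn.tableOperations().attachIterator("readingsDeg", deg);

    BatchWriter bwEdge  = conn.createBatchWriter("readings",    new BatchWriterConfig());
    BatchWriter bwEdgeT = conn.createBatchWriter("readingsT",   new BatchWriterConfig());
    BatchWriter bwDeg   = conn.createBatchWriter("readingsDeg", new BatchWriterConfig());

    // One reading: Machine=neptune, Pool=west, Load=5, ReadingTimestamp=1388191975000
    String row = "1388191975000";
    String[] cols = { "Machine|neptune", "Pool|west", "Load|005" };

    for (String col : cols) {
      Mutation m = new Mutation(new Text(row));        // exploded table: row = timestamp
      m.put(new Text(""), new Text(col), ONE);         // column = "field|value", value = "1"
      bwEdge.addMutation(m);

      Mutation mt = new Mutation(new Text(col));       // transpose table: row and column swapped
      mt.put(new Text(""), new Text(row), ONE);
      bwEdgeT.addMutation(mt);

      Mutation md = new Mutation(new Text(col));       // degree table: one "1" per occurrence,
      md.put(new Text(""), new Text("degree"), ONE);   // summed into a count by the combiner
      bwDeg.addMutation(md);
    }

    bwEdge.close();
    bwEdgeT.close();
    bwDeg.close();
  }
}

Does that look like what you meant, or is a combiner-based approach for the transpose worth the extra complexity?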
On Fri, Dec 27, 2013 at 9:53 PM, Dylan Hutchison <[email protected]> wrote:

> Hi Arshak,
>
> Perhaps this might help, though I will defer to Jeremy on the finer
> points. Keep in mind that writing methods that interface with the D4M
> schema will duplicate some of the Matlab D4M code. Jeremy, is the Java
> source for the D4M JARs available?
>
> First let's weigh potential choices for row IDs:
>
>    1. row ID = ReadingTimestamp. The advantage is that you can easily
>    query for continuous periods of time via the row, but the disadvantage is
>    that all of your writes will go to the end of the table, i.e. to the last
>    tablet server, assuming that your data is streaming in live. If your data
>    is not streaming in live but available offline, you can create table
>    splits to distribute the timestamps evenly across your tablet servers.
>    Scanning may or may not bottleneck on one tablet server depending on your
>    use case.
>    2. row ID = ReadingTimestamp *reversed*. For example, use
>    0005791918831 instead of 1388191975000. Now writes will naturally
>    distribute across all your tablet servers. However, you lose the ability
>    to easily query continuous periods of time.
>    3. row ID = a new unique ID assigned to each row in your table,
>    leaving the ReadingTimestamp as another column. This is a good all-around
>    solution for the regular table, but what about the transpose table?
>    Because it switches the rows and columns, we still have the disadvantage
>    of option #1 in the transpose table.
>
> Since we don't have a clear winner, I wrote a quick implementation of
> option #1. Here's the original table, followed by the exploded table, its
> transpose and the degree table.
>
> Original
>                Load, Machine, Pool,
> 1388191975000, 005,  neptune, west,
> 1388191975010, 009,  neptune, west,
> 1388191975090, 013,  pluto,   east,
>
> Exploded
>                Load|005, Load|009, Load|013, Machine|neptune, Machine|pluto, Pool|east, Pool|west,
> 1388191975000, 1,                            1,                                         1,
> 1388191975010,           1,                  1,                                         1,
> 1388191975090,                     1,                         1,             1,
>
> Transpose
>                  1388191975000, 1388191975010, 1388191975090,
> Load|005,        1,
> Load|009,                       1,
> Load|013,                                      1,
> Machine|neptune, 1,             1,
> Machine|pluto,                                 1,
> Pool|east,                                     1,
> Pool|west,       1,             1,
>
> Degree
>                  degree,
> Load|005,        1,
> Load|009,        1,
> Load|013,        1,
> Machine|neptune, 2,
> Machine|pluto,   1,
> Pool|east,       1,
> Pool|west,       2,
>
> *1. Should the transpose table be built as part of ingest code or as an
> accumulo combiner?*
> I recommend ingest code for much greater simplicity, though it may be
> possible to build a combiner that automatically ingests to a second table.
> When inserting (row,col,val) triples, do another insert to the transpose
> table with (col,row,val). Use summing combiners to create the degree
> tables.
>
> *2. What does the degree table do in this example? The paper mentions
> it's useful for query optimization. How?*
> Suppose you're interested in querying the row timestamps that log a
> *load > 5* (query A) and come from the *east pool* (query B). (You want
> A & B.) *I assume that the number of returned rows matching A & B is far
> less than the total number of rows.* (If they are roughly equal, you must
> query all the rows anyway and a database won't help you.) There are a few
> options:
>
>    1. Scan through all the rows and return only those that match A & B.
>    Sure, this will do the job, but what if you have 100M rows? We can do
>    better with our transpose table.
>    2. Scan through the transpose table for the timestamps that match
>    query A. In our example, this returns 2 rows with loads > 5. Then retain
>    only the rows that also match query B and come from the east pool.
>    3. Scan through the transpose table for the timestamps that match
>    query B. In our example, this returns 1 row that comes from the east
>    pool. Then retain only the rows that also match query A and have
>    loads > 5.
>
> How can we tell whether query strategy #2 or #3 is better? Choose the one
> that returns the fewest rows first! We determine that by consulting the
> degree table twice (two quick lookups for counts) and verifying that there
> are 2 rows for query A and 1 row for query B, thus making strategy #3 the
> better choice. Of course, the difference is 1 row in our small example,
> but it could be much larger on a real database.
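A quick sketch inline here to check that I'm reading this right: the two degree lookups and the "fewest rows first" choice would be something like the following (again untested; the table names match my ingest sketch above, and I'm simplifying query A to a single Load| column, whereas a real "load > 5" would be a range of Load| columns).

import java.util.Map.Entry;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class DegreeLookupSketch {

  // One quick lookup: the count stored for a "field|value" row in the degree table.
  static long degree(Connector conn, String col) throws Exception {
    Scanner s = conn.createScanner("readingsDeg", new Authorizations());
    s.setRange(new Range(col));
    long d = 0;
    for (Entry<Key, Value> e : s)
      d = Long.parseLong(e.getValue().toString());  // the SummingCombiner already added up the 1s
    return d;
  }

  // Pick the side of "A & B" with the smaller degree and scan it first in the transpose table.
  static void intersect(Connector conn, String colA, String colB) throws Exception {
    String first = degree(conn, colA) <= degree(conn, colB) ? colA : colB;

    Scanner s = conn.createScanner("readingsT", new Authorizations());
    s.setRange(new Range(first));                   // e.g. new Range("Pool|east")
    for (Entry<Key, Value> e : s) {
      String timestamp = e.getKey().getColumnQualifier().toString();
      // ... look up each candidate timestamp in the main table and check the other condition ...
      System.out.println(timestamp);
    }
  }
}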
> *3. Does D4M accommodate "repurposing" the row_id to a partition key?
> The wikisearch example shows how the partition id is important for
> parallel scans of the index. But since Accumulo is a row store, how can
> you do fast lookups by row if you've used the row_id as a partition key?*
> Table design is tough when optimizing for the general case. Hopefully the
> above tradeoff analysis helps lay out the different options.
>
> I attached the CSV file and a few lines of Matlab code to recreate the
> tables. If you don't mind reading a little Matlab, take a look at the
> tutorial here:
>
> https://github.com/denine99/d4mBB
>
> I work out a very similar query optimization on baseball data in the
> section "%% Find how many players weigh < 200 lb. and bat with left hand
> or both hands".
>
> Cheers,
> Dylan Hutchison
>
>
> On Fri, Dec 27, 2013 at 8:01 PM, Arshak Navruzyan <[email protected]> wrote:
>
>> Jeremy,
>>
>> Wow, didn't expect to get help from the author :)
>>
>> How about something simple like this:
>>
>> Machine   Pool   Load   ReadingTimestamp
>> neptune   west   5      1388191975000
>> neptune   west   9      1388191975010
>> pluto     east   13     1388191975090
>>
>> These are the areas I am unclear on:
>>
>> 1. Should the transpose table be built as part of ingest code or as an
>> accumulo combiner?
>> 2. What does the degree table do in this example? The paper mentions
>> it's useful for query optimization. How?
>> 3. Does D4M accommodate "repurposing" the row_id to a partition key?
>> The wikisearch example shows how the partition id is important for
>> parallel scans of the index. But since Accumulo is a row store, how can
>> you do fast lookups by row if you've used the row_id as a partition key?
>>
>> Thank you,
>>
>> Arshak
>>
>>
>> On Thu, Dec 26, 2013 at 5:31 PM, Jeremy Kepner <[email protected]> wrote:
>>
>>> Hi Arshak,
>>>   Maybe you can send a few (~3) records of data that you are familiar with
>>> and we can walk you through how the D4M schema would be applied to those
>>> records.
>>>
>>> Regards. -Jeremy
>>>
>>> On Thu, Dec 26, 2013 at 03:10:59PM -0500, Arshak Navruzyan wrote:
>>> > Hello,
>>> > I am trying to get my head around Accumulo schema designs. I went
>>> > through a lot of trouble to get the wikisearch example running, but
>>> > since the data is in protobuf lists, it's not that illustrative (for a
>>> > newbie).
>>> > Would love to find another example that is a little simpler to
>>> > understand. In particular I am interested in java/scala code that
>>> > mimics the D4M schema design (not a Matlab guy).
>>> > Thanks,
>>> > Arshak
>>
>
>
> --
> www.cs.stevens.edu/~dhutchis
