> Sounds like a step toward using a block pool directly and avoiding the filesystem layer (Hadoop 2+).
This has come up previously. With federation, we should be able to embed NN
as a first cut, and own all the blocks in the hbase namespace.

Enis

On Fri, Mar 8, 2013 at 11:32 AM, Sergey Shelukhin <[email protected]> wrote:

> +1.
> That gives us a lot of freedom to do stuff in many scenarios.
>
> On Thu, Mar 7, 2013 at 5:42 PM, Andrew Purtell <[email protected]> wrote:
>
> > also, if instead of files you think about handling blocks directly you
> > can end up doing more stuff, like a proper compaction that require less
> > I/O if N blocks are not changed, some crazy deduplication on tables with
> > same content & similar...
> >
> > Sounds like a step toward using a block pool directly and avoiding the
> > filesystem layer (Hadoop 2+).
> >
> > On Fri, Mar 8, 2013 at 7:36 AM, Matteo Bertozzi <[email protected]> wrote:
> >
> > > sure having the hardlink support
> > > (HDFS-3370<https://issues.apache.org/jira/browse/HDFS-3370>)
> > > solve the HFileLink hack
> > > but you still need to add extra metadata for splits (reference files)
> > >
> > > also, if instead of files you think about handling blocks directly
> > > you can end up doing more stuff, like a proper compaction that
> > > require less I/O if N blocks are not changed, some crazy deduplication
> > > on tables with same content & similar...
> > >
> > > On Thu, Mar 7, 2013 at 11:22 PM, Sergey Shelukhin <[email protected]> wrote:
> > >
> > > > Hmm... ranges sounds good, but for files, it would be nice if there
> > > > were a hardlink mechanism.
> > > > It should be trivial to do in HDFS if blocks could belong to several
> > > > files.
> > > > Then we don't have to have private cleanup code.
> > > >
> > > > On Thu, Mar 7, 2013 at 2:28 PM, Matteo Bertozzi <[email protected]> wrote:
> > > >
> > > > > This is seems to going in a super messy direction.
> > > > > With HBASE-7806 the ideas was to cleanup all this crazy stuff
> > > > > (HFileLink, References, ...)
> > > > >
> > > > > unfortunately the initial decision of tight together the fs layout
> > > > > and the tables/regions/families is bringing to all this workaround
> > > > > to have something cool.
> > > > >
> > > > > If you put the files in one place, and the association in another
> > > > > you can avoid all this complexity.
> > > > >
> > > > > /hbase/data/[file1, file 2, file 3, file N]
> > > > >
> > > > > table 1/region 1: [file 2]
> > > > > table 1/region 2: [file 1 (from 0 to 50)]
> > > > > table 1/region 3: [file 1 (from 50 to 100)]
> > > > > table 2/region 1: [file 1, file 2]
> > > > >
> > > > > On Thu, Mar 7, 2013 at 10:13 PM, Stack <[email protected]> wrote:
> > > > >
> > > > > > Yes. That is a few trips to the NN listing directory contents
> > > > > > and then some edits/reading of .META. We would have to introduce
> > > > > > a QuarterHFile to go with our HalfHFile (or rename HalfHFile as
> > > > > > PieceO'HFile).
> > > > > >
> > > > > > St.Ack
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
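
For readers following the layout proposed in the quoted 2:28 PM message (files in one flat place, the table/region association kept separately), here is a minimal Python sketch of the idea. It is not HBase code: `FileRef`, `Layout`, `assign`, `split`, and `unreferenced` are hypothetical names, and treating a range as a 0 to 100 slice of a file that a split cuts at 50 is an assumption taken from the "from 0 to 50" notation in the example, not from how HBase actually splits by row key.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class FileRef:
    """Hypothetical reference to a slice of a store file (0..100 = whole file)."""
    name: str
    start: int = 0
    end: int = 100


RegionKey = Tuple[str, str]  # (table, region)


class Layout:
    """Flat pool of files plus a separate region -> references map (illustrative only)."""

    def __init__(self, files: Iterable[str]) -> None:
        self.files: Set[str] = set(files)
        self.regions: Dict[RegionKey, List[FileRef]] = {}

    def assign(self, table: str, region: str, refs: Iterable[FileRef]) -> None:
        refs = list(refs)
        for ref in refs:
            if ref.name not in self.files:
                raise KeyError(f"unknown file: {ref.name}")
        self.regions[(table, region)] = refs

    def split(self, table: str, region: str, left: str, right: str, mid: int = 50) -> None:
        """Replace one region with two that reference slices of the same files."""
        refs = self.regions.pop((table, region))
        self.regions[(table, left)] = [
            FileRef(r.name, r.start, min(r.end, mid)) for r in refs if r.start < mid
        ]
        self.regions[(table, right)] = [
            FileRef(r.name, max(r.start, mid), r.end) for r in refs if r.end > mid
        ]

    def unreferenced(self) -> Set[str]:
        """Files no region points at; only these would be safe to clean up."""
        used = {r.name for refs in self.regions.values() for r in refs}
        return self.files - used


# Rebuild the association table from the quoted example.
layout = Layout(["file1", "file2", "file3"])
layout.assign("table1", "region1", [FileRef("file2")])
layout.assign("table1", "parent", [FileRef("file1")])
layout.split("table1", "parent", "region2", "region3")
layout.assign("table2", "region1", [FileRef("file1"), FileRef("file2")])
```

In this sketch a split or a shared file between tables only edits the association map; no link or reference file is written next to the data, and cleanup reduces to checking which files have no remaining references.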
