What Ted just talked about is also explained in this (free) On Demand Training:
https://www.mapr.com/services/mapr-academy/mapr-distribution-essentials-training-course-on-demand

On Fri, May 29, 2015 at 5:29 PM, Ted Dunning <[email protected]> wrote:

> There are two methods to support HBase table APIs. The first is to simply
> run HBase. That is just like, well, running HBase.
>
> The more interesting alternative is to use a special client API that talks
> a special table-oriented wire protocol to the file system, which implements
> a column-family / column-oriented table API similar to what HBase uses.
> The big differences have to do with the fact that code inside the file
> system has capabilities available to it that are not available to HBase.
> For instance, it can use a file-oriented transaction and recovery system.
> It can also make use of knowledge about file system layout that is not
> available to HBase.
>
> Because we can optimize the file layouts, we can also change the low-level
> protocols for disk reorganization. MapR tables have more levels of
> sub-division than HBase, and we use different low-level algorithms. This
> results in having lots of write-ahead logs, which would crush HDFS because
> of the commit rate, but it allows very fast crash recovery (10's to low
> 100's of ms after the basic file system is back).
>
> Also, since the tables are built using standard file-system primitives, all
> of the transactionally correct snapshots and mirrors carry over to tables
> as well.
>
> Oh, and it tends to be a lot faster and more failure tolerant as well.
>
>
> On Fri, May 29, 2015 at 7:00 AM, Yousef Lasi <[email protected]> wrote:
>
> > Could you expand on the HBase table integration? How does that work?
> >
> > On Fri, May 29, 2015 at 5:55 AM, Ted Dunning <[email protected]> wrote:
> >
> > > 4) you get the use of the HBase API without having to run HBase. Tables
> > > are integrated directly into MapR FS.
> > >
> > > On Thu, May 28, 2015 at 9:37 AM, Matt <[email protected]> wrote:
> > >
> > > > I know I can / should assign individual disks to HDFS, but as a test
> > > > cluster there are apps that expect data volumes to work on. A dedicated
> > > > Hadoop production cluster would have a disk layout specific to the task.
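To make the "HBase API without running HBase" point concrete, here is a minimal sketch of what client code might look like. It uses the standard HBase 1.x Java client (HBaseConfiguration, ConnectionFactory, Table, Put, Get); the table path /user/alice/demo_table and column family cf are illustrative assumptions, and the path-style table name is only accepted when the MapR client libraries are on the classpath, since stock HBase does not allow "/" in table names.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class MapRTableSketch {
    public static void main(String[] args) throws IOException {
        // Standard HBase client configuration; with the MapR client on the
        // classpath, the same calls go to tables stored in MapR-FS rather
        // than to an HBase cluster.
        Configuration conf = HBaseConfiguration.create();

        // Hypothetical table path: MapR tables live in the file-system
        // namespace, so the "table name" can be a path.
        TableName tableName = TableName.valueOf("/user/alice/demo_table");

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(tableName)) {

            // Write one cell: row "row1", family "cf", qualifier "greeting".
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("greeting"),
                          Bytes.toBytes("hello"));
            table.put(put);

            // Read the same cell back with a Get.
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            byte[] value = result.getValue(Bytes.toBytes("cf"),
                                           Bytes.toBytes("greeting"));
            System.out.println(Bytes.toString(value));
        }
    }
}
```

The same code should also run against a plain HBase cluster if the path-style name is replaced with an ordinary table name, which is the point of the shared API.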
