What Ted just talked about is also explained in this (free) On Demand Training:
https://www.mapr.com/services/mapr-academy/mapr-distribution-essentials-training-course-on-demand

On Fri, May 29, 2015 at 5:29 PM, Ted Dunning <[email protected]> wrote:

> There are two methods to support HBase table APIs. The first is to simply
> run HBase. That is just like, well, running HBase.
>
> The more interesting alternative is to use a special client API that talks
> a special table-oriented wire protocol to the file system, which implements
> a column-family / column-oriented table API similar to what HBase uses.
> The big differences have to do with the fact that code inside the file
> system has capabilities available to it that are not available to HBase.
> For instance, it can use a file-oriented transaction and recovery system.
> It can also make use of knowledge about file system layout that is not
> available to HBase.
>
> Because we can optimize the file layouts, we can also change the low-level
> protocols for disk reorganization. MapR tables have more levels of
> sub-division than HBase, and we use different low-level algorithms. This
> results in having lots of write-ahead logs, which would crush HDFS because
> of the commit rate, but it allows very fast crash recovery (10's to low
> 100's of ms after the basic file system is back).
>
> Also, since the tables are built using standard file-system primitives, all
> of the transactionally correct snapshots and mirrors carry over to tables
> as well.
>
> Oh, and it tends to be a lot faster and more failure tolerant as well.
>
>
> On Fri, May 29, 2015 at 7:00 AM, Yousef Lasi <[email protected]> wrote:
>
> > Could you expand on the HBase table integration? How does that work?
> >
> > On Fri, May 29, 2015 at 5:55 AM, Ted Dunning <[email protected]> wrote:
> >
> > > 4) you get the use of the HBase API without having to run HBase. Tables
> > > are integrated directly into MapR FS.
> > >
> > > On Thu, May 28, 2015 at 9:37 AM, Matt <[email protected]> wrote:
> > >
> > > > I know I can / should assign individual disks to HDFS, but as a test
> > > > cluster there are apps that expect data volumes to work on. A dedicated
> > > > Hadoop production cluster would have a disk layout specific to the task.
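To make the "HBase API without running HBase" point concrete, here is a minimal sketch of what client code might look like. It uses the standard HBase 1.x Java client (HBaseConfiguration, ConnectionFactory, Table, Put, Get); the table path /user/alice/demo_table and column family cf are illustrative assumptions, and the path-style table name is only accepted when the MapR client libraries are on the classpath, since stock HBase does not allow "/" in table names.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class MapRTableSketch {
    public static void main(String[] args) throws IOException {
        // Standard HBase client configuration; with the MapR client on the
        // classpath, the same calls go to tables stored in MapR-FS rather
        // than to an HBase cluster.
        Configuration conf = HBaseConfiguration.create();

        // Hypothetical table path: MapR tables live in the file-system
        // namespace, so the "table name" can be a path.
        TableName tableName = TableName.valueOf("/user/alice/demo_table");

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(tableName)) {

            // Write one cell: row "row1", family "cf", qualifier "greeting".
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("greeting"),
                          Bytes.toBytes("hello"));
            table.put(put);

            // Read the same cell back with a Get.
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            byte[] value = result.getValue(Bytes.toBytes("cf"),
                                           Bytes.toBytes("greeting"));
            System.out.println(Bytes.toString(value));
        }
    }
}
```

The same code should also run against a plain HBase cluster if the path-style name is replaced with an ordinary table name, which is the point of the shared API.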
