As Stack and Andrew said, just wanted to give you fair warning that this
mode may need some love. Likewise, there are probably alternatives that run
a bit lighter weight, though you flatter us with the reminder of the long
feature list.

I have no problem helping to fix, and committing fixes for, bugs that crop
up in local mode operations. Bring 'em on!

-n

On Tue, Mar 10, 2015 at 3:56 PM, Alex Baranau <[email protected]>
wrote:

> On:
>
> - Future investment in a design that scales better
>
> Indeed, designing against a key-value store is different from designing
> against an RDBMS.
>
> I wonder if you've explored the option of abstracting the storage layer
> and using a "single node purposed" store until you grow enough to switch
> to another one?
>
> E.g. you could use LevelDB [1], which is pretty fast (and there's a Java
> rewrite of it, if you need Java APIs [2]). We use it in CDAP [3] in the
> standalone version to make the development environment (SDK) lighter. We
> swap it with HBase in distributed mode without changing the application
> code. It doesn't have coprocessors and the other HBase-specific features
> you are talking about, though. But you can bridge client APIs with an
> abstraction layer (e.g. we have a common Table interface [4]). You can
> even add versions on cells (see [5] for an example of how we do it).
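>
> To make this concrete, here is a minimal sketch of what such an
> abstraction could look like. The interface and class names are
> illustrative, not CDAP's actual API; the LevelDB calls are from the Java
> rewrite in [2]:
>
>   import java.io.Closeable;
>   import java.io.File;
>   import java.io.IOException;
>   import org.iq80.leveldb.DB;
>   import org.iq80.leveldb.Options;
>   import static org.iq80.leveldb.impl.Iq80DBFactory.factory;
>
>   // Hypothetical key-value abstraction; the app codes against this, and
>   // implementations can be swapped without touching application code.
>   interface KeyValueTable extends Closeable {
>     byte[] get(byte[] key) throws IOException;
>     void put(byte[] key, byte[] value) throws IOException;
>   }
>
>   // Single-node implementation backed by the pure-Java LevelDB port.
>   // Versioned cells could be layered on top by appending an inverted
>   // timestamp to the key, along the lines of [5].
>   class LevelDbTable implements KeyValueTable {
>     private final DB db;
>
>     LevelDbTable(File dir) throws IOException {
>       this.db = factory.open(dir, new Options().createIfMissing(true));
>     }
>
>     public byte[] get(byte[] key) { return db.get(key); }
>     public void put(byte[] key, byte[] value) { db.put(key, value); }
>     public void close() throws IOException { db.close(); }
>   }
>
> A distributed implementation of the same interface would delegate to an
> HBase table, and that is where the HBase-only features would live.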
>
> Also, to start with, you could use an RDBMS behind the key-value
> abstraction, while keeping your app design free of RDBMS specifics.
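>
> For instance, the same hypothetical KeyValueTable interface sketched
> above could sit on a single two-column table via plain JDBC. The upsert
> below assumes H2's MERGE syntax; adjust for your dialect:
>
>   import java.io.IOException;
>   import java.sql.Connection;
>   import java.sql.PreparedStatement;
>   import java.sql.ResultSet;
>   import java.sql.SQLException;
>
>   // RDBMS-backed implementation of the same illustrative interface.
>   class JdbcTable implements KeyValueTable {
>     private final Connection conn;
>
>     JdbcTable(Connection conn) throws SQLException {
>       this.conn = conn;
>       conn.createStatement().execute(
>           "CREATE TABLE IF NOT EXISTS kv (" +
>           "k VARBINARY(255) PRIMARY KEY, v VARBINARY(65535))");
>     }
>
>     public byte[] get(byte[] key) throws IOException {
>       try (PreparedStatement ps =
>           conn.prepareStatement("SELECT v FROM kv WHERE k = ?")) {
>         ps.setBytes(1, key);
>         try (ResultSet rs = ps.executeQuery()) {
>           return rs.next() ? rs.getBytes(1) : null;
>         }
>       } catch (SQLException e) {
>         throw new IOException(e);
>       }
>     }
>
>     public void put(byte[] key, byte[] value) throws IOException {
>       try (PreparedStatement ps =
>           conn.prepareStatement("MERGE INTO kv KEY (k) VALUES (?, ?)")) {
>         ps.setBytes(1, key);
>         ps.setBytes(2, value);
>         ps.executeUpdate();
>       } catch (SQLException e) {
>         throw new IOException(e);
>       }
>     }
>
>     public void close() throws IOException {
>       try { conn.close(); } catch (SQLException e) { throw new IOException(e); }
>     }
>   }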
>
> Alex Baranau
>
> [1] https://github.com/google/leveldb
> [2] https://github.com/dain/leveldb
> [3] http://cdap.io
> [4] https://github.com/caskdata/cdap/blob/develop/cdap-api/src/main/java/co/cask/cdap/api/dataset/table/Table.java
> [5] https://github.com/caskdata/cdap/blob/develop/cdap-data-fabric/src/main/java/co/cask/cdap/data2/dataset2/lib/table/leveldb/LevelDBTableCore.java
>
> --
> http://cdap.io - open source framework to build and run data applications
> on Hadoop & HBase
>
> On Tue, Mar 10, 2015 at 8:42 AM, Rose, Joseph <
> [email protected]> wrote:
>
> > Sorry, never answered your question about versions. I have the 1.0.0
> > version of HBase, which has hadoop-common 2.5.1 in its lib folder.
> >
> >
> > -j
> >
> >
> > On 3/10/15, 11:36 AM, "Rose, Joseph" <[email protected]>
> > wrote:
> >
> > >I tried it and it does work now. It looks like the interface for
> > >hadoop.fs.Syncable changed in March 2012 to remove the deprecated
> > >sync() method and define only hsync() instead. The same committer did
> > >the right thing and removed sync() from FSDataOutputStream at the same
> > >time. The remaining hsync() method calls flush() if the underlying
> > >stream doesn't implement Syncable.
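> > >
> > >Roughly, the dispatch I'm describing looks like this (paraphrased, not
> > >the verbatim Hadoop source):
> > >
> > >  // Sketch of the FSDataOutputStream.hsync() behavior described above.
> > >  public void hsync() throws IOException {
> > >    if (wrappedStream instanceof Syncable) {
> > >      ((Syncable) wrappedStream).hsync();  // real durability if supported
> > >    } else {
> > >      wrappedStream.flush();  // fallback for non-Syncable streams
> > >    }
> > >  }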
> > >
> > >
> > >-j
> > >
> > >
> > >On 3/6/15, 5:24 PM, "Stack" <[email protected]> wrote:
> > >
> > >>On Fri, Mar 6, 2015 at 1:50 PM, Rose, Joseph <
> > >>[email protected]> wrote:
> > >>
> > >>> I think the final issue with hadoop-common (re: unimplemented sync
> > >>> for local filesystems) is the one showstopper for us. We have to
> > >>> have assured durability. I'm willing to devote some cycles to get it
> > >>> done, so maybe I'm the one that says this problem is worthwhile.
> > >>>
> > >>>
> > >>I remember that was once the case, but looking in the codebase now,
> > >>sync calls through to ProtobufLogWriter, which does a 'flush' on the
> > >>output (though a comment says this is a noop). The output stream is an
> > >>instance of FSDataOutputStream made with a RawLOS. The flush should
> > >>come out here:
> > >>
> > >>    public void flush() throws IOException { fos.flush(); }
> > >>
> > >>... where fos is an instance of FileOutputStream.
> > >>
> > >>In sync we go on to call hflush, which looks like it calls flush again.
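> > >>
> > >>A quick way to poke at that path (a sketch against the local
> > >>filesystem; the class name and probe path are illustrative):
> > >>
> > >>  import java.nio.charset.StandardCharsets;
> > >>  import org.apache.hadoop.conf.Configuration;
> > >>  import org.apache.hadoop.fs.FSDataOutputStream;
> > >>  import org.apache.hadoop.fs.FileSystem;
> > >>  import org.apache.hadoop.fs.Path;
> > >>
> > >>  public class SyncProbe {
> > >>    public static void main(String[] args) throws Exception {
> > >>      // Raw local filesystem, bypassing the checksummed wrapper.
> > >>      FileSystem fs =
> > >>          FileSystem.getLocal(new Configuration()).getRawFileSystem();
> > >>      try (FSDataOutputStream out = fs.create(new Path("/tmp/sync-probe"))) {
> > >>        out.write("probe".getBytes(StandardCharsets.UTF_8));
> > >>        out.hflush();  // should land in FileOutputStream.flush()
> > >>        out.hsync();
> > >>      }
> > >>    }
> > >>  }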
> > >>
> > >>What hadoop/hbase versions are we talking about? HADOOP-8861 added
> > >>the above behavior for hadoop 1.2.
> > >>
> > >>Try it I'd say.
> > >>
> > >>St.Ack
> > >
> >
> >
>
