I just replaced the native-filesystem-based solution with HDFS without introducing any additional servers, and it works perfectly in combination with encryption of files. For the POC this is sufficient.
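For anyone wanting to try the same combination, here is a minimal sketch of stream-based file encryption. It is not necessarily how the POC does it. The class name EncryptedStreams and its methods are hypothetical, and the choice of AES in CTR mode is an assumption (it gives confidentiality only, no integrity check). The sketch works on plain java.io streams so that it runs without a cluster. With Hadoop on the classpath, the sink would be the stream returned by FileSystem.create(Path) and the source the one returned by FileSystem.open(Path).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper for illustration; not taken from the POC.
final class EncryptedStreams {
    // Assumption: AES in CTR mode, so data can be streamed without padding.
    private static final String TRANSFORMATION = "AES/CTR/NoPadding";
    private static final int IV_LENGTH = 16; // AES block size in bytes

    private EncryptedStreams() {}

    /** Writes a random IV to sink, then the encrypted bytes of plain; closes sink. */
    static void encryptTo(SecretKey key, InputStream plain, OutputStream sink)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        sink.write(iv);
        try (OutputStream out = new CipherOutputStream(sink, cipher)) {
            copy(plain, out);
        }
    }

    /** Reads the IV prefix from source and returns a stream of decrypted bytes. */
    static InputStream decryptFrom(SecretKey key, InputStream source)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new DataInputStream(source).readFully(iv);
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return new CipherInputStream(source, cipher);
    }

    /** Copies in to out through a fixed buffer, so no whole-document byte array is needed. */
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey key = generator.generateKey();

        // The in-memory streams stand in for HDFS streams in this sketch.
        byte[] document = "secured document".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream stored = new ByteArrayOutputStream();
        encryptTo(key, new ByteArrayInputStream(document), stored);

        ByteArrayOutputStream restored = new ByteArrayOutputStream();
        try (InputStream in = decryptFrom(key, new ByteArrayInputStream(stored.toByteArray()))) {
            copy(in, restored);
        }
        System.out.println(new String(restored.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Because everything goes through streams, the same code should work whether the bytes end up on the local filesystem or in HDFS; only the code that opens the stream changes.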
I think I have spent more time typing emails today than switching to HDFS. Thanks!

On Wed, Jan 5, 2011 at 5:06 PM, Peter Veentjer <[email protected]> wrote:

> On Wed, Jan 5, 2011 at 5:00 PM, Friso van Vollenhoven <[email protected]> wrote:
>
>> I guess so.
>>
>> HBase actually has quite a strong consistency model.
>
> It depends on how consistency is defined. HBase does not support repeatable reads because it has no concept of a transaction, so every time you do a read you may get a different result. For an STM this would be called extremely low consistency. There are higher levels of consistency, like 'snapshot' consistency, where your reads are not only repeatable but also causally consistent. And then of course there is the serializable isolation level, where even write skews are prevented.
>
>> Thing is, that it is just row level. Multi-row transactions would require multiple locks and some kind of commit / roll back solution. Have you had a look at Google's Percolator paper?
>
> Not yet. I'll check it out.
>
>> Friso
>>
>> On 5 jan 2011, at 16:49, Peter Veentjer wrote:
>>
>> > I also want to see if an STM like Multiverse can be aligned with NoSQL solutions like HBase. But to do that, I first need to get more hands-on experience with NoSQL solutions.
>> >
>> > On Wed, Jan 5, 2011 at 4:34 PM, Peter Veentjer <[email protected]> wrote:
>> >
>> >> On Wed, Jan 5, 2011 at 4:03 PM, Friso van Vollenhoven <[email protected]> wrote:
>> >>
>> >>> Hi Peter,
>> >>>
>> >>> Do you mean you want to use the HDFS that HBase relies on for other things and not just exclusively HBase? That should be just fine. We do it all the time.
>> >>
>> >> Ok thanks.
>> >>
>> >>> Are you worried about putting too much load on it?
>> >>
>> >> For the POC it won't matter that much. I can get my stuff up and running.
>> >>
>> >>> I guess that depends on the type of workload that you have and what you do with it. But generally I think it is nice to have all nodes be the same (so all workers are both datanode and region server), so that you don't have to scale them out separately.
>> >>>
>> >>> Peter, are you based in The Netherlands by any chance? There is a NoSQL meetup group in NL (http://www.meetup.com/nosql-nl/) with meetups every now and then. The next one is on January 24 and is all about HBase. We're doing an on-the-spot install on a number of the laptops present to create a temporary cluster and play around with it. I have been working with Hadoop and HBase for the past couple of months, so if you care to come by, I'd be happy to share some experiences.
>> >>
>> >> Yes, I live in Holland. I'm a former Xebia employee :) I think I'll visit one of the NoSQL meetups.
>> >>
>> >> We are building a kind of application server where, instead of providing services like JMS, Servlet, EJBs etc., we are providing services for secured document storage, message exchange, semantic analysis of documents etc. It is all based on GigaSpaces, but I have the impression (after working more than a year with it) that it is very time consuming to get right. Apart from all the correctness issues (and there were/are many, based on bad usage of GigaSpaces and architectural choices) there are also some performance/scalability issues that need solving.
>> >>
>> >> So I decided to rewrite the main use cases using HBase. I had most of the functionality up and running in a few days, and most of the 'bad architectural choices' we are going to remove in the next 6 months are not there from the beginning (e.g. using streams instead of byte arrays for document processing.. how stupid can you be). It also was a nice exercise to play with HBase and less consistent solutions.
>> >>
>> >> I normally work on realizing very high consistency for Multiverse:
>> >>
>> >> http://multiverse.codehaus.org
>> >>
>> >> So I want to have some hands-on experience with using less consistent solutions.
>> >>
>> >>> Friso
>> >>>
>> >>> On 5 jan 2011, at 14:41, Peter Veentjer wrote:
>> >>>
>> >>>> Hi Guys,
>> >>>>
>> >>>> I'm currently writing a POC based on HBase and I spend more time on writing a UI than on writing the HBase functionality. So I'm very excited about exploring HBase further and doing some serious performance and scalability tests to see if we can use it as core technology instead of the time/resource intensive GigaSpaces.
>> >>>>
>> >>>> My question:
>> >>>>
>> >>>> I'm currently using HBase and I also want to use the HDFS directly to store files. If the HBase server(s) is installed, can I directly access the HDFS of these servers, or is it better to set up a separate Hadoop server for running HDFS?
