Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread highpointe
This is rather dated. I would love to see the side-by-side justification if anyone has made the transition lately. Sent from my iPhone On Aug 30, 2011, at 12:02 AM, Chris Tarnas c...@email.com wrote: Someone with better knowledge than might be interested in helping answer this question

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread highpointe
My bad. I was looking at the date of the link, not the post. Please ignore. Sent from my iPhone On Aug 30, 2011, at 12:02 AM, Chris Tarnas c...@email.com wrote: Someone with better knowledge than might be interested in helping answer this question over at StackOverflow:

Real time dynamic data and hbase

2011-08-30 Thread highpointe
We are attempting to build what is akin to a CRM (but not). Our backend is an interface in which clients can control the variables of their assets offering; the variables of the template they use act as the over call on the Db. Within the UI, they have the ability to set thresholds for

Re: Real time dynamic data and hbase

2011-08-30 Thread Sonal Goyal
Hi, Can you please give an example or explain in more detail what you are trying to achieve. Best Regards, Sonal Crux: Reporting for HBase https://github.com/sonalgoyal/crux Nube Technologies http://www.nubetech.co http://in.linkedin.com/in/sonalgoyal On Tue, Aug 30, 2011 at 11:59 AM,

Re: Real time dynamic data and hbase

2011-08-30 Thread lohit
2011/8/29 highpointe highpoint...@gmail.com We are attempting to build what is akin to a CRM (but not). Our backend is an interface in which clients can control the variables of their assets offering; the variables of the template they use act as the over call on the Db. Within the UI,

Re: Real time dynamic data and hbase

2011-08-30 Thread high pointe
Thanks for the response Sonal. Here is an example. The client (backend) is online grocers. The front end client is shoppers / consumers. The grocers have an interface that they can log into and enter their inventory. Example: Apple = 9 The front end is simple; two values the users enter:

Re: Real time dynamic data and hbase

2011-08-30 Thread high pointe
Very interesting. Have you deployed such a setup? How is the performance? It seems it would require quite a bit more than commodity hardware. H-p On Tue, Aug 30, 2011 at 12:37 AM, lohit lohit.vijayar...@gmail.com wrote: 2011/8/29 highpointe highpoint...@gmail.com We are attempting to

Re: Real time dynamic data and hbase

2011-08-30 Thread Sonal Goyal
How about a table which has product code + grocerCode as the key, and a column family items with qualifier quantity? You can have other column families like grocer (columns name etc.). Then you can scan for productCode and return those grocers whose items:quantity > 0. Best Regards, Sonal Crux: Reporting

Re: Real time dynamic data and hbase

2011-08-30 Thread highpointe
Inline. Sent from my iPhone On Aug 30, 2011, at 2:15 AM, Sonal Goyal sonalgoy...@gmail.com wrote: This part I understand. And in essence there would be a key for each product a grocer sells, yes? How about a table which has product code + grocerCode as the key, and column family items with

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Andrew Purtell
Hi Chris, Appreciate your answer on the post. Personally speaking however the endless Cassandra vs. HBase discussion is tiresome and rarely do blog posts or emails in this regard shed any light. Often, Cassandra proponents mis-state their case out of ignorance of HBase or due to commercial or

Re: Real time dynamic data and hbase

2011-08-30 Thread Sonal Goyal
I was talking about concatenating the product code and grocer code. Use that as the rowkey, and put quantity as a column. You can then scan for a particular product code, and return only those rows where quantity is greater than zero. Best Regards, Sonal Crux: Reporting for HBase
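A minimal sketch of the rowkey idea described above, written against an in-memory sorted map rather than the HBase client API. The class name, the `|` separator, and the sample codes are assumptions made for illustration; the only property borrowed from HBase is that rows are kept sorted by key, so every row for one product code sits in a contiguous range that a prefix scan can cover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for a table sorted by row key (hypothetical, not the HBase API). */
final class ProductGrocerIndex {
    // Assumed separator; product and grocer codes must never contain it.
    private static final char SEP = '|';

    // Row key -> quantity column, kept in key order like rows in a region.
    private final TreeMap<String, Integer> quantityByRowKey = new TreeMap<>();

    /** Builds the concatenated key: product code first, then grocer code. */
    static String rowKey(String productCode, String grocerCode) {
        return productCode + SEP + grocerCode;
    }

    void put(String productCode, String grocerCode, int quantity) {
        quantityByRowKey.put(rowKey(productCode, grocerCode), quantity);
    }

    /** Scans the key range for one product and keeps grocers with quantity > 0. */
    List<String> grocersWithStock(String productCode) {
        String start = productCode + SEP;             // inclusive start of the range
        String stop = productCode + (char) (SEP + 1); // exclusive: first key past the prefix
        List<String> grocers = new ArrayList<>();
        for (Map.Entry<String, Integer> row : quantityByRowKey.subMap(start, stop).entrySet()) {
            if (row.getValue() > 0) {
                grocers.add(row.getKey().substring(start.length()));
            }
        }
        return grocers;
    }
}
```

Including the separator in the scan prefix matters: without it, a scan for one product code would also pick up any other product code that merely starts with the same characters.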

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Bernd Fondermann
On Tue, Aug 30, 2011 at 11:47, Andrew Purtell apurt...@apache.org wrote: Hi Chris, Appreciate your answer on the post. Personally speaking however the endless Cassandra vs. HBase discussion is tiresome and rarely do blog posts or emails in this regard shed any light. Often, Cassandra

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Sam Seigal
A question inline: On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell apurt...@apache.org wrote: Hi Chris, Appreciate your answer on the post. Personally speaking however the endless Cassandra vs. HBase discussion is tiresome and rarely do blog posts or emails in this regard shed any light.

RE: Real time dynamic data and hbase

2011-08-30 Thread Michael Segel
I don't understand why you're having trouble with this. You have a simple geo location search based on zip and then a product and inventory count. I mean it's not really geo-spatial because you're searching based on zip code. So you don't need to worry about any sort of geospatial or geodetic

RE: Real time dynamic data and hbase

2011-08-30 Thread Michael Segel
You still need to organize your vendors by delivery zip, which gets very ugly when you try product code + grocerCode. Even doing something like zip + product + vendor as your key gets you a lot of rows. This will work, where you have columns for price, qty on hand, sku, etc... The problem gets
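To make the "a lot of rows" remark concrete, here is a rough sketch of how a key built from zip, product and vendor fans out. The class name, the `|` separator, and the sample values are hypothetical; the point is only that each vendor contributes one row per (delivery zip, product) pair.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrates row fan-out for a zip + product + vendor key (hypothetical layout). */
final class ZipProductVendorKeys {
    /** Zip first, so all offers for one delivery zip sort together. */
    static String rowKey(String zip, String productCode, String vendorCode) {
        return zip + "|" + productCode + "|" + vendorCode;
    }

    /** One row key per (zip, product) pair that the vendor serves. */
    static List<String> rowKeysFor(String vendorCode,
                                   List<String> deliveryZips,
                                   List<String> productCodes) {
        List<String> keys = new ArrayList<>();
        for (String zip : deliveryZips) {
            for (String product : productCodes) {
                keys.add(rowKey(zip, product, vendorCode));
            }
        }
        return keys;
    }
}
```

A vendor that delivers 2,000 products to 50 zip codes would produce 100,000 rows under this layout, each carrying its own price and quantity columns.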

Re: Real time dynamic data and hbase

2011-08-30 Thread highpointe
Thank you all for your input. I think I am closer to a solution that will work. Cheers. H-p Sent from my iPhone On Aug 30, 2011, at 4:57 AM, Sonal Goyal sonalgoy...@gmail.com wrote: I was talking about concatenating the product code and grocer code. Use that as the rowkey, and put

Generating FindBugs HTML report from Maven

2011-08-30 Thread Ramkrishna S Vasudevan
Hi, I would like to generate a FindBugs HTML report for the HBase code. I have added the plugin to the pom.xml. When I run mvn findbugs:findbugs, I get only an XML file. How do I get an HTML file? Regards, Ram

RE: Hbase copytable and export/import

2011-08-30 Thread Stuti Awasthi
Hi Friends, I referred to http://hbase.apache.org/book.html#copytable and I succeeded in copying my table within the same cluster. I am now trying to copy my table to another Hadoop-HBase cluster but am facing issues. I used the following command format: ./hbase

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Ryan Rawson
The Hdfs write pipeline is synchronous, so there is no window. On Aug 30, 2011 4:35 AM, Sam Seigal selek...@yahoo.com wrote: A question inline: On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell apurt...@apache.org wrote: Hi Chris, Appreciate your answer on the post. Personally speaking

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Andrew Purtell
Is the replication strategy for HBase completely reliant on HDFS' block replication pipelining ? Yes. Is this replication process asynchronous ? No. Best regards,    - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Ryan Rawson
I really like the theory of operation stuff. People say that centralized operation is a flaw, but I say it's a strength. In a single datacenter, you have extremely fast .1ms ping or less, there is no need for a fully decentralized architecture - it can be really hard to debug. -ryan On Tue, Aug

multiple insert in hbase

2011-08-30 Thread sriram
After the reducer phase, the table.put(put) call inserts content into the table twice: 1) as the actual key and value, and 2) as the actual key with an empty string as the value. Can anyone give me a solution to this issue?

RE: Hbase copytable and export/import

2011-08-30 Thread Stuti Awasthi
Hi, I tried debugging this and found that port 41181 is not open on my master HBase cluster. I tried to telnet to this port but the connection was refused. These are the steps I followed: Start Hadoop Start Hbase Created a table via hbase shell List the contents of table via hbase shell Tried

Re: Hbase copytable and export/import

2011-08-30 Thread Jean-Daniel Cryans
Your remote cluster reports the location of -ROOT- as on localhost on port 41181: 11/08/30 19:27:47 DEBUG client.HConnectionManager$HConnectionImplementation: Lookedup root region location,

[RFC] ORM over HBase + Solr via Content Repo

2011-08-30 Thread Imran M Yousuf
Hi, We have been working on a CMS over HBase in conjunction with Solr for Full-text Search for our internal product. Now that it is at a stage we can share with others, we would like to have some feedback from you. Home Page: http://kenai.com/projects/smart-cms/pages/Home Pages Relevant to ORM:

Re: multiple insert in hbase

2011-08-30 Thread Jean-Daniel Cryans
Well, it doesn't. I mean, AFAIK there's no bug in the code that does that, so it's probably your code. Wild guess, maybe you are using your own HTable to put plus you emit a Put in the reducer? J-D On Tue, Aug 30, 2011 at 9:21 AM, sriram rsriram...@gmail.com wrote: After the reducer phase the

Re: CopyTable

2011-08-30 Thread Jean-Daniel Cryans
It's all optional. J-D On Mon, Aug 29, 2011 at 10:56 PM, Steinmaurer Thomas thomas.steinmau...@scch.at wrote: Hi Doug, thanks, but the given example is exactly the one I provided in the link. I don't use Replication yet, so I wonder what the rs.class properties should be ... Or is this just

Re: HBase Scan returns fewer columns after a few minutes of insertion

2011-08-30 Thread Jean-Daniel Cryans
If you want to limit the number of rows you can instead set the caching to exactly what you need, or set a stop row. J-D On Mon, Aug 29, 2011 at 11:38 PM, Neerja Bhatnagar neerja...@gmail.com wrote: Hi J-D, Thank you very much! Hopefully, this iteration clears it up for me. The batchSize is

Re: multiple insert in hbase

2011-08-30 Thread sriram
Yes, you are right. What should I do?

initTableReducerJob cannot find symbol

2011-08-30 Thread sriram
tmp.java:595: cannot find symbol symbol : method initTableReducerJob(java.lang.String,java.lang.Class org.myorg.WordCount.IntSumReducer, org.apache.hadoop.mapreduce.Job) location: class org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil TableMapReduceUtil.initTableReducerJob(table,

Re: multiple insert in hbase

2011-08-30 Thread Jean-Daniel Cryans
Put only one, either the HTable or the output format. J-D On Tue, Aug 30, 2011 at 10:56 AM, sriram rsriram...@gmail.com wrote: Yes you are right. What should i do?

Re: multiple insert in hbase

2011-08-30 Thread sriram
Configuration config= context.getConfiguration(); HTable table = new HTable(config,index); context.write(key, new Text(toReturn.toString())); Enumeration e = temp.keys(); Put put = new Put(Bytes.toBytes(key.toString())); while(e.hasMoreElements()) {

Re: multiple insert in hbase

2011-08-30 Thread sriram
This is another error shown: tmp.java:595: cannot find symbol symbol : method initTableReducerJob(java.lang.String,java.lang.Class org.myorg.WordCount.IntSumReducer, org.apache.hadoop.mapreduce.Job) location: class org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Sam Seigal
Will the write call to HBase block until the record written is fully replicated ? If not (since it is happening at the block level), then isn't there a window where a region server goes down, the data might not be available anywhere else, until it comes back up ? On Tue, Aug 30, 2011 at 9:17 AM,

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Joseph Boyd
On Tue, Aug 30, 2011 at 12:22 PM, Sam Seigal selek...@yahoo.com wrote: Will the write call to HBase block until the record written is fully replicated ? no. data isn't written to disk immediately If not (since it is happening at the block level), then isn't there a window where a region

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Ryan Rawson
While data is not fsynced to disk immediately, it is acked by 3 different nodes (Assuming r=3) before HBase acks the client. -ryan On Tue, Aug 30, 2011 at 1:04 PM, Joseph Boyd joseph.b...@cbsinteractive.com wrote: On Tue, Aug 30, 2011 at 12:22 PM, Sam Seigal selek...@yahoo.com wrote: Will the

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Joe Pallas
On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote: Better to focus on improving HBase than play whack a mole. Absolutely. So let's talk about improving HBase. I'm speaking here as someone who has been learning about and experimenting with HBase for more than six months. HBase supports

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Ryan Rawson
On Tue, Aug 30, 2011 at 10:42 AM, Joe Pallas joseph.pal...@oracle.com wrote: On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote: Better to focus on improving HBase than play whack a mole. Absolutely.  So let's talk about improving HBase.  I'm speaking here as someone who has been learning

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Andrew Purtell
Hi Chris, Would you mind if I paraphrase your responses on StackOverflow? Go right ahead. Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White) From: Chris Tarnas c...@email.com To: Andrew

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Andrew Purtell
Hi Joe, HBase supports replication between clusters (i.e. data centers). That’s … debatable.  There's replication support in the code, but several times in the recent past when someone asked about it on this mailing list, the response was “I don't know of anyone actually using it.”

Re: HBase Meetup during Hadoop World NYC '11

2011-08-30 Thread Todd Lipcon
I haven't gotten many responses so far. If there doesn't seem to be much interest, I may not spend the time to organize. If you're feeling too busy to answer the full survey, feel free to just reply with a +1 so I know there's some interest! -Todd On Fri, Aug 26, 2011 at 3:33 PM, Todd Lipcon

Re: HBase Meetup during Hadoop World NYC '11

2011-08-30 Thread Blake Matheny
+1 On Tue, Aug 30, 2011 at 11:21 PM, Todd Lipcon t...@cloudera.com wrote: I haven't gotten many responses so far. If there doesn't seem to be much interest, I may not spend the time to organize. If you're feeling too busy to answer the full survey, feel free to just reply with a +1 so I know

Re: HBase Meetup during Hadoop World NYC '11

2011-08-30 Thread BlueDavy Lin
+1 2011/8/31 Todd Lipcon t...@cloudera.com: I haven't gotten many responses so far. If there doesn't seem to be much interest, I may not spend the time to organize. If you're feeling too busy to answer the full survey, feel free to just reply with a +1 so I know there's some interest!

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Andrew Purtell
Will the write call to HBase block until the record written is fully replicated ? At the HDFS layer, hflush on the write ahead log will block until the data is fully replicated. At the HBase layer, whether the writer (client) will be blocked until HDFS layer actions complete depends on your

Re: HBase Meetup during Hadoop World NYC '11

2011-08-30 Thread highpointe
+1 Sent from my iPhone On Aug 30, 2011, at 9:21 PM, Todd Lipcon t...@cloudera.com wrote: I haven't gotten many responses so far. If there doesn't seem to be much interest, I may not spend the time to organize. If you're feeling too busy to answer the full survey, feel free to just reply

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Andrew Purtell
Will the write call to HBase block until the record written is fully replicated ? no. data isn't written to disk immediately Not so black and white. Full replication in HDFS != writes to disk. Full replication is acknowledgement there are replicas at all DataNodes in the pipeline, and with

RE: Hbase copytable and export/import

2011-08-30 Thread Stuti Awasthi
Hi Jean, I will try to explain more clearly. I have 2 clusters, each a single-node Hadoop and HBase cluster. I can create tables and perform other HBase operations in both clusters individually. Now I want to copy a table from one cluster (say A) to the other cluster (say B). In both the

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Time Less
Most of your points are dead-on. Cassandra is no less complex than HBase. All of this complexity is hidden in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase, etc. -- but the Cassandra internals are no less layered. Operationally, however, HBase is more complex.

Re: HBase and Cassandra on StackOverflow

2011-08-30 Thread Time Less
If you use N=3, W=3, R=1 in Cassandra, you should get similar behavior to HBase/HDFS with respect to consistency and availability My understanding is that R=1 does not guarantee that you won't see different versions of the data in different reads, in some scenarios. There was an
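For readers unfamiliar with the N/W/R notation, the usual quorum rule of thumb can be sketched as below. The class and method names are made up for illustration. The check only says whether a read of R replicas must overlap the W replicas that acknowledged a write; it does not cover failure scenarios such as partially applied writes.

```java
/** Quorum-overlap rule of thumb for N replicas, W write acks, R read replicas. */
final class QuorumOverlap {
    /**
     * Returns true when any set of R replicas must share at least one member
     * with any set of W replicas, i.e. when R + W > N.
     */
    static boolean readOverlapsAckedWrite(int n, int w, int r) {
        if (n < 1 || w < 1 || w > n || r < 1 || r > n) {
            throw new IllegalArgumentException("need 1 <= w <= n and 1 <= r <= n");
        }
        return r + w > n;
    }
}
```

With N=3, W=3, R=1 the sum is 4, so the overlap condition holds; with N=3, W=1, R=1 it does not.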