Re: HBase with opentsdb creates huge .tmp file runs out of hdfs space

2015-02-23 Thread brady2
Hello, 

I am having the exact same issue. OpenTSDB is creating a huge .tmp file and
HBase runs out of HDFS space after ingesting a similar amount of data. 

Could you post the solution please? 

Many thanks 
John 



--
View this message in context: 
http://apache-hbase.679495.n3.nabble.com/HBase-with-opentsdb-creates-huge-tmp-file-runs-out-of-hdfs-space-tp4067577p4068530.html
Sent from the HBase User mailing list archive at Nabble.com.


Re: data partitioning and data model

2015-02-23 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Thanks Alok, 

I will take a good look at the link for sure. 

Just one additional question. Reading this: 
http://stackoverflow.com/questions/13741946/role-of-datanode-regionserver-in-hbase-hadoop-integration
I saw that HBase can rebalance data across region servers to keep the cluster balanced. 
Does this also happen when using pre-loading?

In the case of a rebalance, if I try to WRITE data to a record being 
rebalanced, would the write performance be affected? 

Best regards,
Marcelo Valle.

From: user@hbase.apache.org 
Subject: Re: data partitioning and data model

You don't want a lot of columns in a write heavy table. HBase stores
the row key along with each cell/column (Though old, I find this
still useful: 
http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html)
 Having a lot of columns will amplify the amount of data being stored.

That said, if there are only going to be a handful of alert_ids for a
given user_id+timestamp row key, then you should be ok.

The query "Select * from table where user_id = X and timestamp > T and
(alert_id = id1 or alert_id = id2)" can be accomplished with either
design. See QualifierFilter and FuzzyRowFilter docs to get some ideas.
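
As a rough illustration of the above (not code from this thread): a scan over a user_id + timestamp key range, combined with QualifierFilters for the two alert_ids, could look like the sketch below. The table name "alerts", family "a", user id and alert ids are made-up placeholders, and the API is the 0.94/1.0-era Java client.

    // A rough sketch, not code from this thread: scan one user's rows with
    // timestamp > T and keep only two alert_id qualifiers (OR semantics).
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.filter.*;
    import org.apache.hadoop.hbase.util.Bytes;

    public class AlertScanSketch {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "alerts");     // hypothetical table name
        byte[] user = Bytes.toBytes("user42");         // hypothetical user id
        long t = 1424649600000L;                       // the "T" lower bound

        // Row key = user_id + 8-byte timestamp, so a start/stop row pair
        // covers "user_id = X and timestamp > T".
        Scan scan = new Scan(Bytes.add(user, Bytes.toBytes(t + 1)),
                             Bytes.add(user, Bytes.toBytes(Long.MAX_VALUE)));

        List<Filter> alertIds = new ArrayList<Filter>();
        alertIds.add(new QualifierFilter(CompareFilter.CompareOp.EQUAL,
            new BinaryComparator(Bytes.toBytes("id1"))));
        alertIds.add(new QualifierFilter(CompareFilter.CompareOp.EQUAL,
            new BinaryComparator(Bytes.toBytes("id2"))));
        scan.setFilter(new FilterList(FilterList.Operator.MUST_PASS_ONE, alertIds));

        ResultScanner scanner = table.getScanner(scan);
        try {
          for (Result r : scanner) {
            System.out.println(r);                     // one row per user_id + timestamp
          }
        } finally {
          scanner.close();
          table.close();
        }
      }
    }

With the alternative key (user_id + timestamp + alert_id), the same OR condition would instead be expressed on the row key itself, e.g. with a FuzzyRowFilter or two separate row ranges, as the paragraph above suggests.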

Alok

On Fri, Feb 20, 2015 at 11:21 AM, Marcelo Valle (BLOOMBERG/ LONDON)
mvallemil...@bloomberg.net wrote:
 Hi Alok,

 Thanks for the answer. Yes, I have read this section, but it was a little too 
 abstract for me, I think I was needing to check my understanding. Your answer 
 helped me to confirm I am on the right path, thanks for that.

 One question: if instead of using user_id + timestamp + alert_id  I use 
 user_id + timestamp as row key, I would still be able to store alert_id + 
 alert_data in columns, right?

 I took the idea from the last section of this link: 
 http://www.appfirst.com/blog/best-practices-for-managing-hbase-in-a-high-write-environment/

 But I wonder which option would be better for my case. It seems column scans 
 are not so fast as row scans, but what would be the advantages of one design 
 over the other?

 If I use something like:
 Row key: user_id + timestamp
 Column prefix: alert_id
 Column value: json with alert data
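
 As a write-side sketch of that layout (not from this thread; the table name "alerts", the "a" column family and the sample values are made up):

    // A minimal write-side sketch of the layout above; not from the thread.
    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class AlertPutSketch {
      public static void main(String[] args) throws IOException {
        HTable table = new HTable(HBaseConfiguration.create(), "alerts"); // hypothetical table
        // Row key = user_id + timestamp (8-byte big-endian long).
        byte[] row = Bytes.add(Bytes.toBytes("user42"), Bytes.toBytes(1424649600000L));
        Put put = new Put(row);
        put.add(Bytes.toBytes("a"),                     // hypothetical column family
                Bytes.toBytes("alert-0001"),            // qualifier = alert_id
                Bytes.toBytes("{\"level\":\"warn\"}")); // value = JSON alert data
        table.put(put);
        table.close();
      }
    }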

 Would I be able to do a query like the one below?
 Select * from table where user_id = X and timestamp > T and (alert_id = id1 
 or alert_id = id2)

 Would I be able to do the same query using user_id + timestamp + alert_id as 
 row key?

 Also, I know Cassandra supports up to 2 billion columns per row (2 billion 
 rows per partition in CQL), do you know what's the limit for HBase?

 Best regards,
 Marcelo Valle.

 From: aloksi...@gmail.com
 Subject: Re: data partitioning and data model

 You can use a key like (user_id + timestamp + alert_id) to get
 clustering of rows related to a user. To get better write throughput
 and distribution over the cluster, you could pre-split the table and
 use a consistent hash of the user_id as a row key prefix.
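
 As a rough sketch of that salting idea (not from this thread): a short, deterministic hash of user_id is prepended to the key so that writes spread over a pre-split table. The 16-bucket count and method names below are arbitrary examples.

    import org.apache.hadoop.hbase.util.Bytes;

    public class SaltedKeySketch {
      static final int BUCKETS = 16;   // should match the number of pre-split regions

      // salt(user_id) + user_id + timestamp + alert_id
      static byte[] rowKey(String userId, long ts, String alertId) {
        byte salt = (byte) ((userId.hashCode() & 0x7fffffff) % BUCKETS);
        return Bytes.add(new byte[] { salt },
                         Bytes.toBytes(userId),
                         Bytes.add(Bytes.toBytes(ts), Bytes.toBytes(alertId)));
      }

      // Split points 0x01 .. 0x0F, usable as the splitKeys argument when
      // creating the table pre-split into 16 regions.
      static byte[][] splitPoints() {
        byte[][] splits = new byte[BUCKETS - 1][];
        for (int i = 1; i < BUCKETS; i++) {
          splits[i - 1] = new byte[] { (byte) i };
        }
        return splits;
      }
    }

 The trade-off is on the read side: a per-user query then needs one scan per bucket, since the salt scatters a user's rows across BUCKETS key ranges.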

 Have you looked at the rowkey design section in the hbase book :
 http://hbase.apache.org/book.html#rowkey.design

 Alok

 On Fri, Feb 20, 2015 at 8:49 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hello,

 This is my first message in this mailing list, I just subscribed.

 I have been using Cassandra for the last few years and now I am trying to 
 create a POC using HBase. Therefore, I am reading the HBase docs but it's 
 been really hard to find how HBase behaves in some situations, when compared 
 to Cassandra. I thought maybe it was a good idea to ask here, as people in 
 this list might know the differences better than anyone else.

 What I want to do is to create a simple application optimized for writes (not 
 interested in HBase / Cassandra product comparisons here, I am assuming I 
 will use HBase and that's it, just wanna understand the best way of doing it 
 in HBase world). I want to be able to write alerts to the cluster, where 
 each alert would have columns like:
 - alert id
 - user id
 - date/time
 - alert data

 Later, I want to search for alerts per user, so my main query could be 
 considered to be something like:
 Select * from alerts where user_id = $id and date/time > 10 days ago.

 I want to decide the data model for my application.

 Here are my questions:

 - In Cassandra, I would partition by user + day, as some users can have many 
 alerts and some just 1 or a few. In hbase, assuming all alerts for a user 
 would always fit in a single partition / region, can I just use user_id as 
 my row key and assume data will be distributed along the cluster?

 - Suppose I want to write 100 000 rows from a client machine and these are 
 from 30 000 users. What's the best manner to write these if I want to 
 optimize for writes? Should I batch all 100k requests into one call to a single 
 server? As I am trying to optimize for writes, I would like to split these 
 requests across several nodes instead of sending them all to one. I found 
 this article: 
 

Re: data partitioning and data model

2015-02-23 Thread Marcelo Valle (BLOOMBERG/ LONDON)
I am sorry; please consider that I am using auto pre-splitting for the question below.

From: user@hbase.apache.org 
Subject: Re: data partitioning and data model

Thanks Alok, 

I will take a good look at the link for sure. 

Just an additional question, I saw, reading this: 
http://stackoverflow.com/questions/13741946/role-of-datanode-regionserver-in-hbase-hadoop-integration
That HBase can rebalance data inside region servers to keep cluster balanced. 
Does this happen also when using pre-loading?

In the case of a rebalance, if I try to WRITE data to a record being 
rebalanced, would the write performance be affected? 

Best regards,
Marcelo Valle.

From: user@hbase.apache.org 
Subject: Re: data partitioning and data model

You don't want a lot of columns in a write heavy table. HBase stores
the row key along with each cell/column (Though old, I find this
still useful: 
http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html)
 Having a lot of columns will amplify the amount of data being stored.

That said, if there are only going to be a handful of alert_ids for a
given user_id+timestamp row key, then you should be ok.

The query "Select * from table where user_id = X and timestamp > T and
(alert_id = id1 or alert_id = id2)" can be accomplished with either
design. See QualifierFilter and FuzzyRowFilter docs to get some ideas.

Alok

On Fri, Feb 20, 2015 at 11:21 AM, Marcelo Valle (BLOOMBERG/ LONDON)
mvallemil...@bloomberg.net wrote:
 Hi Alok,

 Thanks for the answer. Yes, I have read this section, but it was a little too 
 abstract for me, I think I was needing to check my understanding. Your answer 
 helped me to confirm I am on the right path, thanks for that.

 One question: if instead of using user_id + timestamp + alert_id  I use 
 user_id + timestamp as row key, I would still be able to store alert_id + 
 alert_data in columns, right?

 I took the idea from the last section of this link: 
 http://www.appfirst.com/blog/best-practices-for-managing-hbase-in-a-high-write-environment/

 But I wonder which option would be better for my case. It seems column scans 
 are not so fast as row scans, but what would be the advantages of one design 
 over the other?

 If I use something like:
 Row key: user_id + timestamp
 Column prefix: alert_id
 Column value: json with alert data

 Would I be able to do a query like the one below?
 Select * from table where user_id = X and timestamp > T and (alert_id = id1 
 or alert_id = id2)

 Would I be able to do the same query using user_id + timestamp + alert_id as 
 row key?

 Also, I know Cassandra supports up to 2 billion columns per row (2 billion 
 rows per partition in CQL), do you know what's the limit for HBase?

 Best regards,
 Marcelo Valle.

 From: aloksi...@gmail.com
 Subject: Re: data partitioning and data model

 You can use a key like (user_id + timestamp + alert_id) to get
 clustering of rows related to a user. To get better write throughput
 and distribution over the cluster, you could pre-split the table and
 use a consistent hash of the user_id as a row key prefix.

 Have you looked at the rowkey design section in the hbase book :
 http://hbase.apache.org/book.html#rowkey.design

 Alok

 On Fri, Feb 20, 2015 at 8:49 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hello,

 This is my first message in this mailing list, I just subscribed.

 I have been using Cassandra for the last few years and now I am trying to 
 create a POC using HBase. Therefore, I am reading the HBase docs but it's 
 been really hard to find how HBase behaves in some situations, when compared 
 to Cassandra. I thought maybe it was a good idea to ask here, as people in 
 this list might know the differences better than anyone else.

 What I want to do is to create a simple application optimized for writes (not 
 interested in HBase / Cassandra product comparisons here, I am assuming I 
 will use HBase and that's it, just wanna understand the best way of doing it 
 in HBase world). I want to be able to write alerts to the cluster, where 
 each alert would have columns like:
 - alert id
 - user id
 - date/time
 - alert data

 Later, I want to search for alerts per user, so my main query could be 
 considered to be something like:
 Select * from alerts where user_id = $id and date/time > 10 days ago.

 I want to decide the data model for my application.

 Here are my questions:

 - In Cassandra, I would partition by user + day, as some users can have many 
 alerts and some just 1 or a few. In hbase, assuming all alerts for a user 
 would always fit in a single partition / region, can I just use user_id as 
 my row key and assume data will be distributed along the cluster?

 - Suppose I want to write 100 000 rows from a client machine and these are 
 from 30 000 users. What's the best manner to write these if I want to 
 optimize for writes? Should I batch all 100k requests into one call to a single 
 server? As I am trying to optimize for writes, I would like to split 

Re: data partitioning and data model

2015-02-23 Thread Alok Singh
Assuming the cluster is not manually balanced, HBase will try to
maintain a roughly equal number of regions on each region server. So,
when you pre-split a table, the regions should get evenly spread out
across all of the region servers. That said, if you are pre-splitting a
new table on a cluster that already has a lot of existing
tables/regions, then you may see an uneven distribution of the regions of the
new table. HBase will try to keep the cluster-wide region distribution
even across all tables, without taking into account the distribution
of regions of a specific table.

Rebalancing shouldn't affect writes that are in flight.

After a split and moving of a region, sometimes data locality between
the region server and the data node that hosts the region data files
is lost. If you have significant load on your cluster, you will notice
an increase in read/write latency in the traffic to these regions. The
locality will eventually return after the next major compaction.
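
Not from the thread, but for reference, the relevant knobs are exposed in the hbase shell (the table name below is made up):

    # hbase shell, table name is hypothetical
    balancer                  # ask the master to run the region balancer once
    major_compact 'alerts'    # rewrite the table's HFiles locally, which is what
                              # eventually restores locality after a region moves
    status 'detailed'         # shows the regions hosted by each region server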

Links that have more details:
http://blog.cloudera.com/blog/2012/06/hbase-write-path/
http://www.ngdata.com/visualizing-hbase-flushes-and-compactions/

Alok

On Mon, Feb 23, 2015 at 8:42 AM, Marcelo Valle (BLOOMBERG/ LONDON)
mvallemil...@bloomberg.net wrote:
 Thanks Alok,

 I will take a good look at the link for sure.

 Just an additional question, I saw, reading this: 
 http://stackoverflow.com/questions/13741946/role-of-datanode-regionserver-in-hbase-hadoop-integration
 That HBase can rebalance data inside region servers to keep cluster balanced. 
 Does this happen also when using pre-loading?

 In the case of a rebalance, if I try to WRITE data to a record being 
 rebalanced, would the write performance be affected?

 Best regards,
 Marcelo Valle.

 From: user@hbase.apache.org
 Subject: Re: data partitioning and data model

 You don't want a lot of columns in a write heavy table. HBase stores
 the row key along with each cell/column (Though old, I find this
 still useful: 
 http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html)
  Having a lot of columns will amplify the amount of data being stored.

 That said, if there are only going to be a handful of alert_ids for a
 given user_id+timestamp row key, then you should be ok.

 The query "Select * from table where user_id = X and timestamp > T and
 (alert_id = id1 or alert_id = id2)" can be accomplished with either
 design. See QualifierFilter and FuzzyRowFilter docs to get some ideas.

 Alok

 On Fri, Feb 20, 2015 at 11:21 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hi Alok,

 Thanks for the answer. Yes, I have read this section, but it was a little 
 too abstract for me, I think I was needing to check my understanding. Your 
 answer helped me to confirm I am on the right path, thanks for that.

 One question: if instead of using user_id + timestamp + alert_id  I use 
 user_id + timestamp as row key, I would still be able to store alert_id + 
 alert_data in columns, right?

 I took the idea from the last section of this link: 
 http://www.appfirst.com/blog/best-practices-for-managing-hbase-in-a-high-write-environment/

 But I wonder which option would be better for my case. It seems column scans 
 are not so fast as row scans, but what would be the advantages of one design 
 over the other?

 If I use something like:
 Row key: user_id + timestamp
 Column prefix: alert_id
 Column value: json with alert data

 Would I be able to do a query like the one below?
 Select * from table where user_id = X and timestamp > T and (alert_id = id1 
 or alert_id = id2)

 Would I be able to do the same query using user_id + timestamp + alert_id as 
 row key?

 Also, I know Cassandra supports up to 2 billion columns per row (2 billion 
 rows per partition in CQL), do you know what's the limit for HBase?

 Best regards,
 Marcelo Valle.

 From: aloksi...@gmail.com
 Subject: Re: data partitioning and data model

 You can use a key like (user_id + timestamp + alert_id) to get
 clustering of rows related to a user. To get better write throughput
 and distribution over the cluster, you could pre-split the table and
 use a consistent hash of the user_id as a row key prefix.

 Have you looked at the rowkey design section in the hbase book :
 http://hbase.apache.org/book.html#rowkey.design

 Alok

 On Fri, Feb 20, 2015 at 8:49 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hello,

 This is my first message in this mailing list, I just subscribed.

 I have been using Cassandra for the last few years and now I am trying to 
 create a POC using HBase. Therefore, I am reading the HBase docs but it's 
 been really hard to find how HBase behaves in some situations, when 
 compared to Cassandra. I thought maybe it was a good idea to ask here, as 
 people in this list might know the differences better than anyone else.

 What I want to do is to create a simple application optimized for writes 
 (not interested in HBase / Cassandra product comparisons here, I 

Re: data partitioning and data model

2015-02-23 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Thanks a lot!

From: aloksi...@gmail.com 
Subject: Re: data partitioning and data model

I meant, in the normal course of operation, rebalancing will not
affect writes in flight. This is never an issue when pre splitting
because, by definition, splits occurred before data was written to the
regions.

If I choose to automatically split rows, but choosing a row key like
we described in this thread to keep data almost evenly distributed on
every partition, I might end up having the increase in read/write
latency when data is moving from a region to the other, although this
could be rare, is this right?
Yes.

Alok

On Mon, Feb 23, 2015 at 10:11 AM, Marcelo Valle (BLOOMBERG/ LONDON)
mvallemil...@bloomberg.net wrote:
 Alok, just to clarify:

 When you say Rebalancing shouldn't affect writes that are in flight. = you 
 mean just in the case I manually split the data on table creation right?
 If I choose to automatically split rows, but choosing a row key like we 
 described in this thread to keep data almost evenly distributed on every 
 partition, I might end up having the increase in read/write latency when data 
 is moving from a region to the other, although this could be rare, is this 
 right?

 From: user@hbase.apache.org
 Subject: Re: data partitioning and data model

 Assuming the cluster is not manually balanced, hbase will try to
 maintain roughly equal number of regions on each region server. So,
 when you pre-split a table, the regions should get evenly spread out
 to all of the region servers. That said, if you are pre-splitting a
 new table on a cluster that already has a lot of existing
 tables/regions, then you may see uneven distribution of regions of the
 new table. Hbase will try to keep the cluster wide region distribution
 even across all tables, without taking into account the distribution
 of regions of a specific table.

 Rebalancing shouldn't affect writes that are in flight.

 After a split and moving of a region, sometimes data locality between
 the region server and the data node that hosts the region data files
 is lost. If you have significant load on your cluster, you will notice
 an increase in read/write latency in the traffic to these regions. The
 locality will eventually return after the next major compaction.

 Links that have more details:
 http://blog.cloudera.com/blog/2012/06/hbase-write-path/
 http://www.ngdata.com/visualizing-hbase-flushes-and-compactions/

 Alok

 On Mon, Feb 23, 2015 at 8:42 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Thanks Alok,

 I will take a good look at the link for sure.

 Just an additional question, I saw, reading this: 
 http://stackoverflow.com/questions/13741946/role-of-datanode-regionserver-in-hbase-hadoop-integration
 That HBase can rebalance data inside region servers to keep cluster 
 balanced. Does this happen also when using pre-loading?

 In the case of a rebalance, if I try to WRITE data to a record being 
 rebalanced, would the write performance be affected?

 Best regards,
 Marcelo Valle.

 From: user@hbase.apache.org
 Subject: Re: data partitioning and data model

 You don't want a lot of columns in a write heavy table. HBase stores
 the row key along with each cell/column (Though old, I find this
 still useful: 
 http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html)
  Having a lot of columns will amplify the amount of data being stored.

 That said, if there are only going to be a handful of alert_ids for a
 given user_id+timestamp row key, then you should be ok.

 The query "Select * from table where user_id = X and timestamp > T and
 (alert_id = id1 or alert_id = id2)" can be accomplished with either
 design. See QualifierFilter and FuzzyRowFilter docs to get some ideas.

 Alok

 On Fri, Feb 20, 2015 at 11:21 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hi Alok,

 Thanks for the answer. Yes, I have read this section, but it was a little 
 too abstract for me, I think I was needing to check my understanding. Your 
 answer helped me to confirm I am on the right path, thanks for that.

 One question: if instead of using user_id + timestamp + alert_id  I use 
 user_id + timestamp as row key, I would still be able to store alert_id + 
 alert_data in columns, right?

 I took the idea from the last section of this link: 
 http://www.appfirst.com/blog/best-practices-for-managing-hbase-in-a-high-write-environment/

 But I wonder which option would be better for my case. It seems column 
 scans are not so fast as row scans, but what would be the advantages of one 
 design over the other?

 If I use something like:
 Row key: user_id + timestamp
 Column prefix: alert_id
 Column value: json with alert data

 Would I be able to do a query like the one below?
 Select * from table where user_id = X and timestamp > T and (alert_id = id1 
 or alert_id = id2)

 Would I be able to do the same query using user_id + timestamp + alert_id 
 as row key?

 Also, I know Cassandra 

Re: data partitioning and data model

2015-02-23 Thread Marcelo Valle (BLOOMBERG/ LONDON)
Alok, just to clarify:

When you say "Rebalancing shouldn't affect writes that are in flight", do you 
mean just the case where I manually split the data at table creation, right?
If I instead choose automatic splitting, but pick a row key like the one we 
described in this thread so that data stays almost evenly distributed across 
every partition, I might still see the increase in read/write latency while data 
is moving from one region to another, even though this should be rare. Is that 
right?

From: user@hbase.apache.org 
Subject: Re: data partitioning and data model

Assuming the cluster is not manually balanced, hbase will try to
maintain roughly equal number of regions on each region server. So,
when you pre-split a table, the regions should get evenly spread out
to all of the region servers. That said, if you are pre-splitting a
new table on a cluster that already has a lot of existing
tables/regions, then you may see uneven distribution of regions of the
new table. Hbase will try to keep the cluster wide region distribution
even across all tables, without taking into account the distribution
of regions of a specific table.

Rebalancing shouldn't affect writes that are in flight.

After a split and moving of a region, sometimes data locality between
the region server and the data node that hosts the region data files
is lost. If you have significant load on your cluster, you will notice
an increase in read/write latency in the traffic to these regions. The
locality will eventually return after the next major compaction.

Links that have more details:
http://blog.cloudera.com/blog/2012/06/hbase-write-path/
http://www.ngdata.com/visualizing-hbase-flushes-and-compactions/

Alok

On Mon, Feb 23, 2015 at 8:42 AM, Marcelo Valle (BLOOMBERG/ LONDON)
mvallemil...@bloomberg.net wrote:
 Thanks Alok,

 I will take a good look at the link for sure.

 Just an additional question, I saw, reading this: 
 http://stackoverflow.com/questions/13741946/role-of-datanode-regionserver-in-hbase-hadoop-integration
 That HBase can rebalance data inside region servers to keep cluster balanced. 
 Does this happen also when using pre-loading?

 In the case of a rebalance, if I try to WRITE data to a record being 
 rebalanced, would the write performance be affected?

 Best regards,
 Marcelo Valle.

 From: user@hbase.apache.org
 Subject: Re: data partitioning and data model

 You don't want a lot of columns in a write heavy table. HBase stores
 the row key along with each cell/column (Though old, I find this
 still useful: 
 http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html)
  Having a lot of columns will amplify the amount of data being stored.

 That said, if there are only going to be a handful of alert_ids for a
 given user_id+timestamp row key, then you should be ok.

 The query "Select * from table where user_id = X and timestamp > T and
 (alert_id = id1 or alert_id = id2)" can be accomplished with either
 design. See QualifierFilter and FuzzyRowFilter docs to get some ideas.

 Alok

 On Fri, Feb 20, 2015 at 11:21 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hi Alok,

 Thanks for the answer. Yes, I have read this section, but it was a little 
 too abstract for me, I think I was needing to check my understanding. Your 
 answer helped me to confirm I am on the right path, thanks for that.

 One question: if instead of using user_id + timestamp + alert_id  I use 
 user_id + timestamp as row key, I would still be able to store alert_id + 
 alert_data in columns, right?

 I took the idea from the last section of this link: 
 http://www.appfirst.com/blog/best-practices-for-managing-hbase-in-a-high-write-environment/

 But I wonder which option would be better for my case. It seems column scans 
 are not so fast as row scans, but what would be the advantages of one design 
 over the other?

 If I use something like:
 Row key: user_id + timestamp
 Column prefix: alert_id
 Column value: json with alert data

 Would I be able to do a query like the one below?
 Select * from table where user_id = X and timestamp > T and (alert_id = id1 
 or alert_id = id2)

 Would I be able to do the same query using user_id + timestamp + alert_id as 
 row key?

 Also, I know Cassandra supports up to 2 billion columns per row (2 billion 
 rows per partition in CQL), do you know what's the limit for HBase?

 Best regards,
 Marcelo Valle.

 From: aloksi...@gmail.com
 Subject: Re: data partitioning and data model

 You can use a key like (user_id + timestamp + alert_id) to get
 clustering of rows related to a user. To get better write throughput
 and distribution over the cluster, you could pre-split the table and
 use a consistent hash of the user_id as a row key prefix.

 Have you looked at the rowkey design section in the hbase book :
 http://hbase.apache.org/book.html#rowkey.design

 Alok

 On Fri, Feb 20, 2015 at 8:49 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hello,

 This is my first 

Re: data partitioning and data model

2015-02-23 Thread Alok Singh
I meant, in the normal course of operation, rebalancing will not
affect writes in flight. This is never an issue when pre-splitting
because, by definition, splits occurred before data was written to the
regions.

If I choose to automatically split rows, but choosing a row key like
we described in this thread to keep data almost evenly distributed on
every partition, I might end up having the increase in read/write
latency when data is moving from a region to the other, although this
could be rare, is this right?
Yes.

Alok

On Mon, Feb 23, 2015 at 10:11 AM, Marcelo Valle (BLOOMBERG/ LONDON)
mvallemil...@bloomberg.net wrote:
 Alok, just to clarify:

 When you say Rebalancing shouldn't affect writes that are in flight. = you 
 mean just in the case I manually split the data on table creation right?
 If I choose to automatically split rows, but choosing a row key like we 
 described in this thread to keep data almost evenly distributed on every 
 partition, I might end up having the increase in read/write latency when data 
 is moving from a region to the other, although this could be rare, is this 
 right?

 From: user@hbase.apache.org
 Subject: Re: data partitioning and data model

 Assuming the cluster is not manually balanced, hbase will try to
 maintain roughly equal number of regions on each region server. So,
 when you pre-split a table, the regions should get evenly spread out
 to all of the region servers. That said, if you are pre-splitting a
 new table on a cluster that already has a lot of existing
 tables/regions, then you may see uneven distribution of regions of the
 new table. Hbase will try to keep the cluster wide region distribution
 even across all tables, without taking into account the distribution
 of regions of a specific table.

 Rebalancing shouldn't affect writes that are in flight.

 After a split and moving of a region, sometimes data locality between
 the region server and the data node that hosts the region data files
 is lost. If you have significant load on your cluster, you will notice
 an increase in read/write latency in the traffic to these regions. The
 locality will eventually return after the next major compaction.

 Links that have more details:
 http://blog.cloudera.com/blog/2012/06/hbase-write-path/
 http://www.ngdata.com/visualizing-hbase-flushes-and-compactions/

 Alok

 On Mon, Feb 23, 2015 at 8:42 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Thanks Alok,

 I will take a good look at the link for sure.

 Just an additional question, I saw, reading this: 
 http://stackoverflow.com/questions/13741946/role-of-datanode-regionserver-in-hbase-hadoop-integration
 That HBase can rebalance data inside region servers to keep cluster 
 balanced. Does this happen also when using pre-loading?

 In the case of a rebalance, if I try to WRITE data to a record being 
 rebalanced, would the write performance be affected?

 Best regards,
 Marcelo Valle.

 From: user@hbase.apache.org
 Subject: Re: data partitioning and data model

 You don't want a lot of columns in a write heavy table. HBase stores
 the row key along with each cell/column (Though old, I find this
 still useful: 
 http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html)
  Having a lot of columns will amplify the amount of data being stored.

 That said, if there are only going to be a handful of alert_ids for a
 given user_id+timestamp row key, then you should be ok.

 The query "Select * from table where user_id = X and timestamp > T and
 (alert_id = id1 or alert_id = id2)" can be accomplished with either
 design. See QualifierFilter and FuzzyRowFilter docs to get some ideas.

 Alok

 On Fri, Feb 20, 2015 at 11:21 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hi Alok,

 Thanks for the answer. Yes, I have read this section, but it was a little 
 too abstract for me, I think I was needing to check my understanding. Your 
 answer helped me to confirm I am on the right path, thanks for that.

 One question: if instead of using user_id + timestamp + alert_id  I use 
 user_id + timestamp as row key, I would still be able to store alert_id + 
 alert_data in columns, right?

 I took the idea from the last section of this link: 
 http://www.appfirst.com/blog/best-practices-for-managing-hbase-in-a-high-write-environment/

 But I wonder which option would be better for my case. It seems column 
 scans are not so fast as row scans, but what would be the advantages of one 
 design over the other?

 If I use something like:
 Row key: user_id + timestamp
 Column prefix: alert_id
 Column value: json with alert data

 Would I be able to do a query like the one below?
 Select * from table where user_id = X and timestamp > T and (alert_id = id1 
 or alert_id = id2)

 Would I be able to do the same query using user_id + timestamp + alert_id 
 as row key?

 Also, I know Cassandra supports up to 2 billion columns per row (2 billion 
 rows per partition in CQL), do you 

Re: HBase Region always in transition + corrupt HDFS

2015-02-23 Thread Jean-Marc Spaggiari
You have no choice other than removing those files... you will lose the
related data, but it should be fine if they are only HFiles. Do you have the
list of corrupted files? What kind of files are they?

Also, have you lost a node or a disk? How have you lost about 150 blocks?

JM

2015-02-23 2:47 GMT-05:00 Arinto Murdopo ari...@gmail.com:

 Hi all,

 We're running HBase (0.94.15-cdh4.6.0) on top of HDFS (Hadoop
 2.0.0-cdh4.6.0).
 For all of our tables, we set the replication factor to 1 (dfs.replication
 = 1 in hbase-site.xml). We set to 1 because we want to minimize the HDFS
 usage (now we realize we should set this value to at least 2, because
 failure is a norm in distributed systems).

 Due to the amount of data, at some point, we have low disk space in HDFS
 and one of our DNs was down. Now we have these problems in HBase and HDFS
 although we have recovered our DN.

 Issue #1. Some HBase regions are always in transition. 'hbase hbck -repair'
 is stuck because it's waiting for the region transitions to finish. Some output:

 hbase(main):003:0> status 'detailed'
 12 regionsInTransition
 plr_id_insta_media_live,\x02:;6;7;398962:3:399a49:653:64,1421565172917.1528f288473632aca2636443574a6ba1.
     state=OPENING, ts=1424227696897, server=null
 plr_sg_insta_media_live,\x0098;522:997;8798665a64;67879,1410768824800.2c79bbc5c0dc2d2b39c04c8abc0a90ff.
     state=OFFLINE, ts=1424227714203, server=null
 plr_sg_insta_media_live,\x00465892:9935773828;a4459;649,1410767723471.55097cfc60bc9f50303dadb02abcd64b.
     state=OPENING, ts=1424227701234, server=null
 plr_sg_insta_media_live,\x00474973488232837733a38744,1410767723471.740d6655afb74a2ff421c6ef16037f57.
     state=OPENING, ts=1424227708053, server=null
 plr_id_insta_media_live,\x02::449::4;:466;3988a6432677;3,1419435100617.7caf3d749dce37037eec9ccc29d272a1.
     state=OPENING, ts=1424227701484, server=null
 plr_sg_insta_media_live,\x05779793546323;::4:4a3:8227928,1418845792479.81c4da129ae5b7b204d5373d9e0fea3d.
     state=OPENING, ts=1424227705353, server=null
 plr_sg_insta_media_live,\x009;5:686348963:33:5a5634887,1410769837567.8a9ded24960a7787ca016e2073b24151.
     state=OPENING, ts=1424227706293, server=null
 plr_sg_insta_media_live,\x0375;6;7377578;84226a7663792,1418980694076.a1e1c98f646ee899010f19a9c693c67c.
     state=OPENING, ts=1424227680569, server=null
 plr_sg_insta_media_live,\x018;3826368274679364a3;;73457;,1421425643816.b04ffda1b2024bac09c9e6246fb7b183.
     state=OPENING, ts=1424227680538, server=null
 plr_sg_insta_media_live,\x0154752;22:43377542:a:86:239,1410771044924.c57d6b4d23f21d3e914a91721a99ce12.
     state=OPENING, ts=1424227710847, server=null
 plr_sg_insta_media_live,\x0069;7;9384697:;8685a885485:,1410767928822.c7b5e53cdd9e1007117bcaa199b30d1c.
     state=OPENING, ts=1424227700962, server=null
 plr_sg_insta_media_live,\x04994537646:78233569a3467:987;7,1410787903804.cd49ec64a0a417aa11949c2bc2d3df6e.
     state=OPENING, ts=1424227691774, server=null


 Issue #2. The next step we take is to check the HDFS file status using
 'hdfs fsck /'. It shows that the filesystem '/' is corrupt, with these
 statistics:

  Total size:    15494284950796 B  (Total open files size: 17179869184 B)
  Total dirs:    9198
  Total files:   124685  (Files currently being written: 21)
  Total blocks (validated):  219620  (avg. block size 70550427 B)
                             (Total open file blocks (not validated): 144)

   CORRUPT FILES:   42
   MISSING BLOCKS:  142
   MISSING SIZE:    14899184084 B
   CORRUPT BLOCKS:  142

  Corrupt blocks:        142
  Number of data-nodes:  14
  Number of racks:       1
 FSCK ended at Tue Feb 17 17:25:18 SGT 2015 in 3026 milliseconds

 The filesystem under path '/' is CORRUPT

 So it seems that HDFS lost some of its blocks due to DN failures, and since
 the dfs.replication factor is 1, it could not recover the missing blocks.

 Issue #3. Although 'hbase hbck -repair' is stuck, we are able to run
 'hbase hbck -fixHdfsHoles'. We notice the following error messages (I copied
 some of them to represent each type of error message that we have):

 - ERROR: Region { meta =
   plr_id_insta_media_live,\x02:;6;7;398962:3:399a49:653:64,1421565172917.1528f288473632aca2636443574a6ba1.,
   hdfs = hdfs://nameservice1/hbase/plr_id_insta_media_live/1528f288473632aca2636443574a6ba1,
   deployed =  } not deployed on any region server.

 - ERROR: Region { meta = null, hdfs =
   hdfs://nameservice1/hbase/plr_sg_insta_media_live/8473d25be5980c169bff13cf90229939,
   deployed =  } on HDFS, but not listed in META or deployed on any region server

 - ERROR: Region { meta =
   plr_sg_insta_media_live,\x0293:729769;975376;2a33995622;3,1421985489851.8819ebd296f075513056be4bbd30ee9c.,
   hdfs = null, deployed =  } found in META, but not in HDFS or deployed on
   any region server.

 - ERROR: There is a hole in the region chain 

Re: HBase Region always in transition + corrupt HDFS

2015-02-23 Thread Michael Segel

 On Feb 23, 2015, at 1:47 AM, Arinto Murdopo ari...@gmail.com wrote:
 
 We're running HBase (0.94.15-cdh4.6.0) on top of HDFS (Hadoop
 2.0.0-cdh4.6.0).
 For all of our tables, we set the replication factor to 1 (dfs.replication
 = 1 in hbase-site.xml). We set to 1 because we want to minimize the HDFS
 usage (now we realize we should set this value to at least 2, because
 failure is a norm in distributed systems).


Sorry, but you really want this to be a replication value of at least 3 and not 
2. 

Suppose you have corruption but not a lost block. Which copy of the two is 
right?
With 3, you can compare the three and hopefully 2 of the 3 will match. 
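
For reference, a minimal sketch of the setting being discussed, raised back to the HDFS default of 3. The property name is the one used earlier in the thread; whether it lives in hbase-site.xml (as in the original post) or cluster-wide in hdfs-site.xml depends on where you want it to apply.

    <!-- Sketch only: the dfs.replication override discussed above. -->
    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>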





Re: data partitioning and data model

2015-02-23 Thread Michael Segel
Hi, 

Yes, you would want to start your key with user_id. 
But you don't need the timestamp; user_id + alert_id should be enough for 
the key. 
If you want to get fancy…

If your alert_id is not a number, you could use EPOCH - timestamp as a way 
to invert the order of the alerts so that the latest alert comes first.
If your alert_id is a number, you could just use EPOCH - alert_id to get the 
alerts in reverse order, with the latest alert first. 
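
A minimal sketch of that "reverse the ordering" idea (not from the thread): the post says "EPOCH - Timestamp"; the sketch below uses Long.MAX_VALUE - timestamp as one common way to get the same effect, and the method and parameter names are made up.

    import org.apache.hadoop.hbase.util.Bytes;

    public class ReversedKeySketch {
      // user_id + (Long.MAX_VALUE - timestamp) + alert_id:
      // newer alerts sort first within a user's key range.
      static byte[] rowKey(String userId, long tsMillis, String alertId) {
        return Bytes.add(Bytes.toBytes(userId),
                         Bytes.toBytes(Long.MAX_VALUE - tsMillis),
                         Bytes.toBytes(alertId));
      }
    }

A plain scan starting at the user_id prefix then returns that user's alerts newest-first, without any reverse scan support.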

Depending on the number of alerts, you could make the table wider and store 
multiple alerts in a row… but that brings in a different debate when it comes 
to row width and how you use the data. 

 On Feb 20, 2015, at 12:55 PM, Alok Singh aloksi...@gmail.com wrote:
 
 You can use a key like (user_id + timestamp + alert_id) to get
 clustering of rows related to a user. To get better write throughput
 and distribution over the cluster, you could pre-split the table and
 use a consistent hash of the user_id as a row key prefix.
 
 Have you looked at the rowkey design section in the hbase book :
 http://hbase.apache.org/book.html#rowkey.design
 
 Alok
 
 On Fri, Feb 20, 2015 at 8:49 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hello,
 
 This is my first message in this mailing list, I just subscribed.
 
 I have been using Cassandra for the last few years and now I am trying to 
 create a POC using HBase. Therefore, I am reading the HBase docs but it's 
 been really hard to find how HBase behaves in some situations, when compared 
 to Cassandra. I thought maybe it was a good idea to ask here, as people in 
 this list might know the differences better than anyone else.
 
 What I want to do is to create a simple application optimized for writes (not 
 interested in HBase / Cassandra product comparisons here, I am assuming I 
 will use HBase and that's it, just wanna understand the best way of doing it 
 in HBase world). I want to be able to write alerts to the cluster, where 
 each alert would have columns like:
 - alert id
 - user id
 - date/time
 - alert data
 
 Later, I want to search for alerts per user, so my main query could be 
 considered to be something like:
 Select * from alerts where user_id = $id and date/time > 10 days ago.
 
 I want to decide the data model for my application.
 
 Here are my questions:
 
 - In Cassandra, I would partition by user + day, as some users can have many 
 alerts and some just 1 or a few. In hbase, assuming all alerts for a user 
 would always fit in a single partition / region, can I just use user_id as 
 my row key and assume data will be distributed along the cluster?
 
 - Suppose I want to write 100 000 rows from a client machine and these are 
 from 30 000 users. What's the best manner to write these if I want to 
 optimize for writes? Should I batch all 100k requests into one call to a single 
 server? As I am trying to optimize for writes, I would like to split these 
 requests across several nodes instead of sending them all to one. I found 
 this article: 
 http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/ But 
 not sure if it's what I need
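
 As a rough sketch of the batching question above (not from this thread): the 0.94/1.0-era HTable client groups a List<Put> by destination region server, so a single batched call from one client machine does not funnel all writes through one node. Table, family and key values below are hypothetical.

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BatchWriteSketch {
      public static void main(String[] args) throws IOException {
        HTable table = new HTable(HBaseConfiguration.create(), "alerts");
        table.setAutoFlush(false);               // buffer client-side, flush in chunks
        List<Put> batch = new ArrayList<Put>();
        for (int i = 0; i < 100000; i++) {
          byte[] row = Bytes.add(Bytes.toBytes("user" + (i % 30000)),
                                 Bytes.toBytes(System.currentTimeMillis()));
          Put p = new Put(row);
          p.add(Bytes.toBytes("a"), Bytes.toBytes("alert-" + i), Bytes.toBytes("{}"));
          batch.add(p);
          if (batch.size() == 1000) {            // send in chunks of 1000 puts
            table.put(batch);
            batch.clear();
          }
        }
        table.put(batch);
        table.flushCommits();                    // push anything still buffered
        table.close();
      }
    }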
 
 Thanks in advance!
 
 Best regards,
 Marcelo.
 





Re: HBase Region always in transition + corrupt HDFS

2015-02-23 Thread Nick Dimiduk
HBase/HDFS are maintaining block checksums, so presumably a corrupted block
would fail checksum validation. Increasing the number of replicas increases
the odds that you'll still have a valid block. I'm not an HDFS expert, but
I would be very surprised if HDFS is validating a questionable block via
byte-wise comparison over the network amongst the replica peers.

On Mon, Feb 23, 2015 at 12:25 PM, Michael Segel mse...@segel.com wrote:


 On Feb 23, 2015, at 1:47 AM, Arinto Murdopo ari...@gmail.com wrote:

 We're running HBase (0.94.15-cdh4.6.0) on top of HDFS (Hadoop
 2.0.0-cdh4.6.0).
 For all of our tables, we set the replication factor to 1 (dfs.replication
 = 1 in hbase-site.xml). We set to 1 because we want to minimize the HDFS
 usage (now we realize we should set this value to at least 2, because
 failure is a norm in distributed systems).



 Sorry, but you really want this to be a replication value of at least 3
 and not 2.

 Suppose you have corruption but not a lost block. Which copy of the two is
 right?
 With 3, you can compare the three and hopefully 2 of the 3 will match.




Re: HBase with opentsdb creates huge .tmp file runs out of hdfs space

2015-02-23 Thread Nick Dimiduk
Oh I see snappy and no block encoder. How about stack traces while the
endless file is being created (like, a couple/sec)? Poor man's sampler.

On Monday, February 23, 2015, Nick Dimiduk ndimi...@gmail.com wrote:

 Can anyone reproducing this provide additional details requested earlier:

 you using any BlockEncoding or Compression with this column family? Any
 other store/table configuration? This happens repeatably? Can you provide
 jstack of the RS process along with log lines while this file is growing
 excessively?

 On Monday, February 23, 2015, sathyafmt sathya...@gmail.com wrote:

 John - No solution yet, I didn't hear anything back from the group.. I am
 still running into this issue. Are you running on a VM or bare-metal ?

 Thanks
 -sathya



 --
 View this message in context:
 http://apache-hbase.679495.n3.nabble.com/HBase-with-opentsdb-creates-huge-tmp-file-runs-out-of-hdfs-space-tp4067577p4068547.html
 Sent from the HBase User mailing list archive at Nabble.com.




Re: HBase with opentsdb creates huge .tmp file runs out of hdfs space

2015-02-23 Thread sathyafmt
Nick,

Look at my reply from 02/06/2015, I have the stack traces on my google
drive...

===
We ran into this issue again at the customer site and I collected the region
server dumps (25 of them at 10s intervals). I uploaded them to my Google
Drive:
https://drive.google.com/file/d/0B53HyylRdzUuZi1aMXRBd2V2V2s/view?usp=sharing 
(apaste.info has a 1M cap, this file is around 6M)
===

thanks
-sathya




--
View this message in context: 
http://apache-hbase.679495.n3.nabble.com/HBase-with-opentsdb-creates-huge-tmp-file-runs-out-of-hdfs-space-tp4067577p4068553.html
Sent from the HBase User mailing list archive at Nabble.com.


Error: Could not find or load main class .usr.java.packages.lib.amd64:.usr.lib64:.lib64:.lib:.usr.lib

2015-02-23 Thread jeypijeypi
I downloaded the latest HBase stable release, which is 1.0.0. I extracted
the file, added JAVA_HOME in hbase-env.sh, and upon execution of
start-hbase.sh I am getting this error:

Error: Could not find or load main class
.usr.java.packages.lib.amd64:.usr.lib64:.lib64:.lib:.usr.lib

I am using jdk 1.7.0. I have Apache Falcon, Ranger and Hadoop installed, and
all is running without issues.
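
For reference only (the thread does not contain the resolution): the configuration step described above normally amounts to one line in conf/hbase-env.sh, and the error text is literally a path list being handed to Java as a class name, so it is worth re-checking every line added to that file. The JDK path below is a hypothetical placeholder.

    # Sketch only: the JAVA_HOME step described above, in conf/hbase-env.sh.
    # Point it at the actual JDK 1.7 install directory, with nothing else on the line.
    export JAVA_HOME=/usr/lib/jvm/jdk1.7.0

    # then, from the HBase install directory:
    bin/start-hbase.sh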





--
View this message in context: 
http://apache-hbase.679495.n3.nabble.com/Error-Could-not-find-or-load-main-class-usr-java-packages-lib-amd64-usr-lib64-lib64-lib-usr-lib-tp4068558.html
Sent from the HBase User mailing list archive at Nabble.com.


Re: HBase Region always in transition + corrupt HDFS

2015-02-23 Thread Michael Segel
I’m sorry, but I implied checking the checksums of the blocks. 
Didn’t think I needed to spell it out.  Next time I’ll be a bit more precise. 

 On Feb 23, 2015, at 2:34 PM, Nick Dimiduk ndimi...@gmail.com wrote:
 
 HBase/HDFS are maintaining block checksums, so presumably a corrupted block
 would fail checksum validation. Increasing the number of replicas increases
 the odds that you'll still have a valid block. I'm not an HDFS expert, but
 I would be very surprised if HDFS is validating a questionable block via
 byte-wise comparison over the network amongst the replica peers.
 
 On Mon, Feb 23, 2015 at 12:25 PM, Michael Segel mse...@segel.com wrote:
 
 
 On Feb 23, 2015, at 1:47 AM, Arinto Murdopo ari...@gmail.com wrote:
 
 We're running HBase (0.94.15-cdh4.6.0) on top of HDFS (Hadoop
 2.0.0-cdh4.6.0).
 For all of our tables, we set the replication factor to 1 (dfs.replication
 = 1 in hbase-site.xml). We set to 1 because we want to minimize the HDFS
 usage (now we realize we should set this value to at least 2, because
 failure is a norm in distributed systems).
 
 
 
 Sorry, but you really want this to be a replication value of at least 3
 and not 2.
 
 Suppose you have corruption but not a lost block. Which copy of the two is
 right?
 With 3, you can compare the three and hopefully 2 of the 3 will match.
 
 





Re: HBase with opentsdb creates huge .tmp file runs out of hdfs space

2015-02-23 Thread sathyafmt
John - No solution yet, I didn't hear anything back from the group.. I am
still running into this issue. Are you running on a VM or bare-metal ?

Thanks
-sathya



--
View this message in context: 
http://apache-hbase.679495.n3.nabble.com/HBase-with-opentsdb-creates-huge-tmp-file-runs-out-of-hdfs-space-tp4067577p4068547.html
Sent from the HBase User mailing list archive at Nabble.com.


Re: HBase Region always in transition + corrupt HDFS

2015-02-23 Thread Ted Yu
Arinto:
Probably you should take a look at HBASE-12949.

Cheers

On Mon, Feb 23, 2015 at 5:25 PM, Arinto Murdopo ari...@gmail.com wrote:

 @JM:
 You mentioned about deleting the files, are you referring to HDFS files
 or file on HBase?

 Our cluster have 15 nodes. We used 14 of them as DN. Actually we tried to
 enable the remaining one as DN (so that we have 15 DN), but then we
 disabled it (so now we have 14 again). Probably our crawlers write some
 data into the additional DN without any replication. Maybe I could try to
 enable again the DN.

 I don't have the list of the corrupted files yet. I notice that when I try
 to Get some of the files, my HBase client code throws these exceptions:
 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
 attempts=2, exceptions:
 Mon Feb 23 17:49:32 SGT 2015,
 org.apache.hadoop.hbase.client.HTable$3@11ff4a1c,
 org.apache.hadoop.hbase.NotServingRegionException:
 org.apache.hadoop.hbase.NotServingRegionException: Region is not online:

 plr_sg_insta_media_live,\x0177998597896:953:5:a5:58786,1410771627251.6c323832d2dc77c586f1cf6441c7ef6e.

 Can I use these exceptions to determine the corrupted files?
 The files are media data (images or videos) obtained from the internet.

 @Michael Segel: Yup, 3 is the default and recommended value. We were
 overwhelmed with the amount of data, so we foolishly reduced our
 replication factor. We have learnt the lesson the hard way :).

 Fortunately it's okay to lose the data, i.e. we can easily recover them
 from our other data.



 Arinto
 www.otnira.com

 On Tue, Feb 24, 2015 at 8:06 AM, Michael Segel mse...@segel.com wrote:

  I’m sorry, but I implied checking the checksums of the blocks.
  Didn’t think I needed to spell it out.  Next time I’ll be a bit more
  precise.
 
   On Feb 23, 2015, at 2:34 PM, Nick Dimiduk ndimi...@gmail.com wrote:
  
   HBase/HDFS are maintaining block checksums, so presumably a corrupted
  block
   would fail checksum validation. Increasing the number of replicas
  increases
   the odds that you'll still have a valid block. I'm not an HDFS expert,
  but
   I would be very surprised if HDFS is validating a questionable block
  via
   byte-wise comparison over the network amongst the replica peers.
  
   On Mon, Feb 23, 2015 at 12:25 PM, Michael Segel mse...@segel.com
  wrote:
  
  
   On Feb 23, 2015, at 1:47 AM, Arinto Murdopo ari...@gmail.com wrote:
  
   We're running HBase (0.94.15-cdh4.6.0) on top of HDFS (Hadoop
   2.0.0-cdh4.6.0).
   For all of our tables, we set the replication factor to 1
  (dfs.replication
   = 1 in hbase-site.xml). We set to 1 because we want to minimize the
 HDFS
   usage (now we realize we should set this value to at least 2, because
   failure is a norm in distributed systems).
  
  
  
   Sorry, but you really want this to be a replication value of at least
 3
   and not 2.
  
   Suppose you have corruption but not a lost block. Which copy of the
 two
  is
   right?
   With 3, you can compare the three and hopefully 2 of the 3 will match.
  
  
 
 



Re: data partitioning and data model

2015-02-23 Thread Michael Segel
Yes and no. 

It's a bit more complicated; it also depends on the data and on how you're using 
the data. 

I wouldn't go too thin and I wouldn't go too fat. 

 On Feb 20, 2015, at 2:19 PM, Alok Singh aloksi...@gmail.com wrote:
 
 You don't want a lot of columns in a write heavy table. HBase stores
 the row key along with each cell/column (Though old, I find this
 still useful: 
 http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html)
 Having a lot of columns will amplify the amount of data being stored.
 
 That said, if there are only going to be a handful of alert_ids for a
 given user_id+timestamp row key, then you should be ok.
 
 The query "Select * from table where user_id = X and timestamp > T and
 (alert_id = id1 or alert_id = id2)" can be accomplished with either
 design. See QualifierFilter and FuzzyRowFilter docs to get some ideas.
 
 Alok
 
 On Fri, Feb 20, 2015 at 11:21 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hi Alok,
 
 Thanks for the answer. Yes, I have read this section, but it was a little 
 too abstract for me, I think I was needing to check my understanding. Your 
 answer helped me to confirm I am on the right path, thanks for that.
 
 One question: if instead of using user_id + timestamp + alert_id  I use 
 user_id + timestamp as row key, I would still be able to store alert_id + 
 alert_data in columns, right?
 
 I took the idea from the last section of this link: 
 http://www.appfirst.com/blog/best-practices-for-managing-hbase-in-a-high-write-environment/
 
 But I wonder which option would be better for my case. It seems column scans 
 are not so fast as row scans, but what would be the advantages of one design 
 over the other?
 
 If I use something like:
 Row key: user_id + timestamp
 Column prefix: alert_id
 Column value: json with alert data
 
 Would I be able to do a query like the one below?
 Select * from table where user_id = X and timestamp > T and (alert_id = id1 
 or alert_id = id2)
 
 Would I be able to do the same query using user_id + timestamp + alert_id as 
 row key?
 
 Also, I know Cassandra supports up to 2 billion columns per row (2 billion 
 rows per partition in CQL), do you know what's the limit for HBase?
 
 Best regards,
 Marcelo Valle.
 
 From: aloksi...@gmail.com
 Subject: Re: data partitioning and data model
 
 You can use a key like (user_id + timestamp + alert_id) to get
 clustering of rows related to a user. To get better write throughput
 and distribution over the cluster, you could pre-split the table and
 use a consistent hash of the user_id as a row key prefix.
 
 Have you looked at the rowkey design section in the hbase book :
 http://hbase.apache.org/book.html#rowkey.design
 
 Alok
 
 On Fri, Feb 20, 2015 at 8:49 AM, Marcelo Valle (BLOOMBERG/ LONDON)
 mvallemil...@bloomberg.net wrote:
 Hello,
 
 This is my first message in this mailing list, I just subscribed.
 
 I have been using Cassandra for the last few years and now I am trying to 
 create a POC using HBase. Therefore, I am reading the HBase docs but it's 
 been really hard to find how HBase behaves in some situations, when 
 compared to Cassandra. I thought maybe it was a good idea to ask here, as 
 people in this list might know the differences better than anyone else.
 
 What I want to do is to create a simple application optimized for writes 
 (not interested in HBase / Cassandra product comparisons here, I am 
 assuming I will use HBase and that's it, just wanna understand the best way 
 of doing it in HBase world). I want to be able to write alerts to the 
 cluster, where each alert would have columns like:
 - alert id
 - user id
 - date/time
 - alert data
 
 Later, I want to search for alerts per user, so my main query could be 
 considered to be something like:
 Select * from alerts where user_id = $id and date/time > 10 days ago.
 
 I want to decide the data model for my application.
 
 Here are my questions:
 
 - In Cassandra, I would partition by user + day, as some users can have 
 many alerts and some just 1 or a few. In hbase, assuming all alerts for a 
 user would always fit in a single partition / region, can I just use 
 user_id as my row key and assume data will be distributed along the cluster?
 
 - Suppose I want to write 100 000 rows from a client machine and these are 
 from 30 000 users. What's the best manner to write these if I want to 
 optimize for writes? Should I batch all 100k requests into one call to a single 
 server? As I am trying to optimize for writes, I would like to split these 
 requests across several nodes instead of sending them all to one. I found 
 this article: 
 http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/ But 
 not sure if it's what I need
 
 Thanks in advance!
 
 Best regards,
 Marcelo.
 
 
 





Re: HBase Region always in transition + corrupt HDFS

2015-02-23 Thread Jean-Marc Spaggiari
2015-02-23 20:25 GMT-05:00 Arinto Murdopo ari...@gmail.com:

 @JM:
 You mentioned about deleting the files, are you referring to HDFS files
 or file on HBase?


Your HBase files are stored in HDFS. So I think we are referring to the same
thing. Look into /hbase in your HDFS to find the HBase files.




 Our cluster have 15 nodes. We used 14 of them as DN. Actually we tried to
 enable the remaining one as DN (so that we have 15 DN), but then we
 disabled it (so now we have 14 again). Probably our crawlers write some
 data into the additional DN without any replication. Maybe I could try to
 enable again the DN.


That's a very valid option. If you still have the DN directories, just
re-enable it to see if you can recover the blocks...



 I don't have the list of the corrupted files yet. I notice that when I try
 to Get some of the files, my HBase client code throws these exceptions:
 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
 attempts=2, exceptions:
 Mon Feb 23 17:49:32 SGT 2015,
 org.apache.hadoop.hbase.client.HTable$3@11ff4a1c,
 org.apache.hadoop.hbase.NotServingRegionException:
 org.apache.hadoop.hbase.NotServingRegionException: Region is not online:

 plr_sg_insta_media_live,\x0177998597896:953:5:a5:58786,1410771627251.6c323832d2dc77c586f1cf6441c7ef6e.


FSCK should give you the list of corrupt files. Can you extract it from
there?
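
A small sketch of the fsck invocations that produce that list; these are standard hdfs fsck flags, not something specific to this thread:

    hdfs fsck / -list-corruptfileblocks        # just the files with corrupt/missing blocks
    hdfs fsck /hbase -files -blocks -locations # per-file block detail for the HBase tree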




 Can I use these exceptions to determine the corrupted files?
 The files are media data (images or videos) obtained from the internet.


This exception gives you all the hints for a directory most probably under
/hbase/plr_sg_insta_media_live/6c323832d2dc77c586f1cf6441c7ef6e

Files under this directory might be corrupted, but you need to find which
files. If it's an HFile, it's easy. If it's the .regioninfo, it's a bit more
tricky.

JM



 Arinto
 www.otnira.com

 On Tue, Feb 24, 2015 at 8:06 AM, Michael Segel mse...@segel.com wrote:

  I’m sorry, but I implied checking the checksums of the blocks.
  Didn’t think I needed to spell it out.  Next time I’ll be a bit more
  precise.
 
   On Feb 23, 2015, at 2:34 PM, Nick Dimiduk ndimi...@gmail.com wrote:
  
   HBase/HDFS are maintaining block checksums, so presumably a corrupted
  block
   would fail checksum validation. Increasing the number of replicas
  increases
   the odds that you'll still have a valid block. I'm not an HDFS expert,
  but
   I would be very surprised if HDFS is validating a questionable block
  via
   byte-wise comparison over the network amongst the replica peers.
  
   On Mon, Feb 23, 2015 at 12:25 PM, Michael Segel mse...@segel.com
  wrote:
  
  
   On Feb 23, 2015, at 1:47 AM, Arinto Murdopo ari...@gmail.com wrote:
  
   We're running HBase (0.94.15-cdh4.6.0) on top of HDFS (Hadoop
   2.0.0-cdh4.6.0).
    For all of our tables, we set the replication factor to 1
    (dfs.replication = 1 in hbase-site.xml). We set it to 1 because we wanted
    to minimize HDFS usage (now we realize we should set this value to at
    least 2, because failure is the norm in distributed systems).
  
  
  
   Sorry, but you really want this to be a replication value of at least
 3
   and not 2.
  
   Suppose you have corruption but not a lost block. Which copy of the
 two
  is
   right?
   With 3, you can compare the three and hopefully 2 of the 3 will match.
  
  
 
 



Re: HBase with opentsdb creates huge .tmp file runs out of hdfs space

2015-02-23 Thread Nick Dimiduk
Can anyone reproducing this provide the additional details requested earlier:

Are you using any BlockEncoding or Compression with this column family? Any
other store/table configuration? Does this happen repeatably? Can you provide
a jstack of the RS process, along with log lines captured while this file is
growing excessively?

On Monday, February 23, 2015, sathyafmt sathya...@gmail.com wrote:

 John - No solution yet; I didn't hear anything back from the group. I am
 still running into this issue. Are you running on a VM or on bare metal?

 Thanks
 -sathya



 --
 View this message in context:
 http://apache-hbase.679495.n3.nabble.com/HBase-with-opentsdb-creates-huge-tmp-file-runs-out-of-hdfs-space-tp4067577p4068547.html
 Sent from the HBase User mailing list archive at Nabble.com.



Re: HBase Region always in transition + corrupt HDFS

2015-02-23 Thread Arinto Murdopo
On Tue, Feb 24, 2015 at 9:46 AM, Jean-Marc Spaggiari 
jean-m...@spaggiari.org wrote:

 I don't have the list of the corrupted files yet. I notice that when I try
  to Get some of the files, my HBase client code throws these exceptions:
  org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
  attempts=2, exceptions:
  Mon Feb 23 17:49:32 SGT 2015,
  org.apache.hadoop.hbase.client.HTable$3@11ff4a1c,
  org.apache.hadoop.hbase.NotServingRegionException:
  org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
 
 
 plr_sg_insta_media_live,\x0177998597896:953:5:a5:58786,1410771627251.6c323832d2dc77c586f1cf6441c7ef6e.
 

 FSCK should give you the list of corrupt files. Can you extract it from
 there?


Yup, I managed to extract them. We have corrupt files as well as missing
files. Luckily no .regioninfo file is corrupted or missing. I'll read more
about HFiles before updating this thread further. :)


 
  Can I use these exceptions to determine the corrupted files?
  The files are media data (images or videos) obtained from the internet.
 

 This exception gives you all the hints: the region directory is most probably
 /hbase/plr_sg_insta_media_live/6c323832d2dc77c586f1cf6441c7ef6e

 Files under this directory might be corrupted, but you need to find which
 files. If it's an HFile, it's easy. If it's the .regioninfo, it's a bit more
 tricky.




Arinto
www.otnira.com


Re: HBase Region always in transition + corrupt HDFS

2015-02-23 Thread Arinto Murdopo
@JM:
You mentioned deleting the files; are you referring to HDFS files
or files in HBase?

Our cluster has 15 nodes. We used 14 of them as DNs. Actually we tried to
enable the remaining one as a DN (so that we had 15 DNs), but then we
disabled it (so now we have 14 again). Probably our crawlers wrote some
data onto the additional DN without any replication. Maybe I could try to
enable that DN again.

I don't have the list of the corrupted files yet. I notice that when I try
to Get some of the files, my HBase client code throws these exceptions:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
attempts=2, exceptions:
Mon Feb 23 17:49:32 SGT 2015,
org.apache.hadoop.hbase.client.HTable$3@11ff4a1c,
org.apache.hadoop.hbase.NotServingRegionException:
org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
plr_sg_insta_media_live,\x0177998597896:953:5:a5:58786,1410771627251.6c323832d2dc77c586f1cf6441c7ef6e.

Can I use these exceptions to determine the corrupted files?
The files are media data (images or videos) obtained from the internet.

@Michael Segel: Yup, 3 is the default and recommended value. We were
overwhelmed with the amount of data, so we foolishly reduced our
replication factor. We have learnt the lesson the hard way :).

Fortunately it's okay to lose the data, i.e. we can easily recover it
from our other data.



Arinto
www.otnira.com

On Tue, Feb 24, 2015 at 8:06 AM, Michael Segel mse...@segel.com wrote:

 I’m sorry, but I implied checking the checksums of the blocks.
 Didn’t think I needed to spell it out.  Next time I’ll be a bit more
 precise.

  On Feb 23, 2015, at 2:34 PM, Nick Dimiduk ndimi...@gmail.com wrote:
 
  HBase/HDFS are maintaining block checksums, so presumably a corrupted
 block
  would fail checksum validation. Increasing the number of replicas
 increases
  the odds that you'll still have a valid block. I'm not an HDFS expert,
 but
  I would be very surprised if HDFS is validating a questionable block
 via
  byte-wise comparison over the network amongst the replica peers.
 
  On Mon, Feb 23, 2015 at 12:25 PM, Michael Segel mse...@segel.com
 wrote:
 
 
  On Feb 23, 2015, at 1:47 AM, Arinto Murdopo ari...@gmail.com wrote:
 
  We're running HBase (0.94.15-cdh4.6.0) on top of HDFS (Hadoop
  2.0.0-cdh4.6.0).
   For all of our tables, we set the replication factor to 1
   (dfs.replication = 1 in hbase-site.xml). We set it to 1 because we wanted
   to minimize HDFS usage (now we realize we should set this value to at
   least 2, because failure is the norm in distributed systems).
 
 
 
  Sorry, but you really want this to be a replication value of at least 3
  and not 2.
 
  Suppose you have corruption but not a lost block. Which copy of the two
 is
  right?
  With 3, you can compare the three and hopefully 2 of the 3 will match.
 
 




RE: HTable or HConnectionManager, how a client connect to HBase?

2015-02-23 Thread Liu, Ming (HPIT-GADSC)
Thanks, Enis,

Your reply is very clear; I finally understand it now.

Best Regards,
Ming
-Original Message-
From: Enis Söztutar [mailto:enis@gmail.com] 
Sent: Thursday, February 19, 2015 10:41 AM
To: hbase-user
Subject: Re: HTable or HConnectionManager, how a client connect to HBase?

It is a bit more complex than that. It is actually a hash of some subset of the
configuration properties. See the HConnectionKey class if you want to learn more.
But the important thing is that with the new style you do not need to worry
about any of this, since there is no implicit connection sharing. Everything
is explicit now.

Enis
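
For reference, a minimal sketch of the explicit lifecycle against the 0.98 client API might look like the following; "hbase_table1" is the table name from the original question, while the row key used in the Get is made up for illustration.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTableInterface;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ExplicitConnectionExample {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();

        // One heavyweight connection per process, opened and closed explicitly.
        HConnection connection = HConnectionManager.createConnection(conf);
        try {
          // Table handles are lightweight; create and close them per unit of work.
          HTableInterface table = connection.getTable("hbase_table1");
          try {
            Result r = table.get(new Get(Bytes.toBytes("some-row-key")));
            System.out.println("cells returned: " + r.size());
          } finally {
            table.close();    // releases the handle; the shared connection stays open
          }
        } finally {
          connection.close(); // tears down the ZooKeeper session, RPC client, thread pools
        }
      }
    }

With this style there is nothing to reason about regarding hashing of Configuration objects: whatever code shares the HConnection object shares the underlying connection, and nothing else does.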

On Tue, Feb 17, 2015 at 11:50 PM, Serega Sheypak serega.shey...@gmail.com
wrote:

 Hi, Enis Söztutar
 You wrote:
 You are right that the constructor new HTable(Configuration, ..) will
 share the underlying connection if the same configuration object is used.

 What does the same mean here? Is equality checked using reference equality
 (Java ==) or using the equals(Object other) method?


 2015-02-18 7:34 GMT+03:00 Enis Söztutar enis@gmail.com:

  Hi,
 
  You are right that the constructor new HTable(Configuration, ..) will
  share the underlying connection if the same configuration object is used.
  A Connection is a heavyweight object that holds the ZooKeeper connection,
  the RPC client, socket connections to multiple region servers and the
  master, the thread pool, etc. You definitely do not want to create
  multiple connections per process unless you know what you are doing.
 
  The model has changed, and the old way of HTable(Configuration, ..) is
  deprecated because we want to make the Connection lifecycle management
  explicit. In the new model, an opened Connection is closed by the user
  again, and lightweight Table instances are obtained from the Connection.
  Having HTables share their connections implicitly makes reasoning about
  it too hard. The new model should be pretty easy to follow.
 
  Enis
 
  On Sat, Feb 14, 2015 at 6:45 AM, Liu, Ming (HPIT-GADSC) 
 ming.l...@hp.com
  wrote:
 
   Hi,
  
   I am using HBase 0.98.6.
  
    I learned from this mailing list before that the recommended method
    to 'connect' to HBase from a client is to use HConnectionManager like this:
    HConnection con = HConnectionManager.createConnection(configuration);
    HTableInterface table = con.getTable("hbase_table1");
    Instead of:
    HTableInterface table = new HTable(configuration, "hbase_table1");
  
    I don't quite understand the reason. I was thinking that each time I
    initialize an HTable instance, it needs to create a new HConnection, and
    that is expensive. But using the first method, multiple HTable instances
    can share the same HConnection. That is quite reasonable to me.
    However, I read in some articles on the internet that, even if I use
    the 'new HTable(conf, tbl)' method, all the HTable instances will still
    share the same HConnection as long as the 'conf' object is the same one.
    I recently read yet another article which said that when using
    'new HTable(conf, tbl)', one does not need to use exactly the same 'conf'
    object (the same one in memory): if two different 'conf' objects have all
    the same attributes (for example, both created from the same
    hbase-site.xml and never changed), then the HTable objects can still
    share the same HConnection. I also tried to read the HTable source code;
    it is very hard, but it seems to me the last statement is correct:
    'HTables will share an HConnection if the configurations are all the
    same'.
  
    Sorry for being so verbose. My questions:
    If two 'configuration' objects are equal, can two HTable objects
    instantiated with them (directly using the 'new HTable()' method) still
    share the same HConnection?
    If the answer is 'yes', then why do I still need the HConnectionManager
    to create a shared connection?
    I am talking about 0.98.6.
    I googled for days and even tried to read the HBase source code, but I
    still get really confused. I also tried to do some tests, but since I am
    such a newbie I don't know how to verify the difference; I really don't
    know what an HConnection does under the hood. I counted the ZooKeeper
    client requests and found some difference. If this difference in
    ZooKeeper requests is a correct metric, it means to me that two HTables
    do not share an HConnection even when using the same 'configuration' in
    the constructor. So it confused me more and more.
  
    Could someone kindly help me with this newbie question? Thanks in
    advance.
  
   Thanks,
   Ming