Re: Reply: flushing + compactions after config change
bq. On Thu, Jun 27, 2013 at 4:27 PM, Viral Bajaria viral.baja...@gmail.com wrote: It's not random, it picks the region with the most data in its memstores. That's weird, because I see some of my regions which receive the least amount of data in a given time period flushing before the regions that are receiving data continuously.

I agree with Viral here. When max logs are reached, we look at the oldest WAL and see which regions should be flushed in order to get that first (read: oldest) WAL archived. In your case, Viral, these regions could be those which are not receiving many edits when 32 logs have been rolled. It may be very specific to your use case, but you could try playing with the max number of logs? Maybe make it 16, 40, etc.?

On Fri, Jun 28, 2013 at 4:53 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:

On Fri, Jun 28, 2013 at 2:39 PM, Viral Bajaria viral.baja...@gmail.com wrote:

On Fri, Jun 28, 2013 at 9:31 AM, Jean-Daniel Cryans jdcry...@apache.org wrote:

On Thu, Jun 27, 2013 at 4:27 PM, Viral Bajaria viral.baja...@gmail.com wrote: It's not random, it picks the region with the most data in its memstores.

That's weird, because I see some of my regions which receive the least amount of data in a given time period flushing before the regions that are receiving data continuously. The reason I know this is because of the write pattern. Some of my tables are in catch-up mode, i.e. I am ingesting data from the past and they always have something to do, while some tables are not in catch-up mode and are just sitting idle most of the time. Yet I see a high number of flushes for those regions too.

I doubt that the fact that it's a major compaction is making everything worse. When a minor gets promoted into a major it's because we're already going to compact all the files, so we might as well get rid of some deletes at the same time. They are all getting selected because the files are within the selection ratio. I would not focus on this to resolve your problem.

I meant worse for my writes, not for HBase as a whole.

I haven't been closely following this thread, but have you posted a log snippet somewhere? It's usually much more telling and we eliminate a few levels of interpretation. Make sure it's at DEBUG, and that you grab a few hours of activity. Get the GC log for the same time as well. Drop this on a web server or pastebin if it fits.

The only log snippet that I posted was the flushing action. Also, that log was not everything; I had grep'd a few lines out. Let me collect some more stats here and post it again. I just enabled GC logging on this server; I deployed the wrong config initially, which had no GC logging. I am not sure how GC logs will help here given that I am at less than 50% heap space used, so I would doubt a stop-the-world GC is happening. Are you trying to look for some other information?

Just trying to cover all the bases.

J-D
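For anyone who wants to experiment with the suggestion above, the relevant knobs are server-side settings that belong in the region servers' hbase-site.xml. A minimal sketch of the property names and types, assuming 0.94-era names; the values shown are only examples, and the programmatic form is just to illustrate them:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  public class WalTuningSketch {
    public static void main(String[] args) {
      Configuration conf = HBaseConfiguration.create();
      // Max number of WAL files before regions are force-flushed so the
      // oldest log can be archived (the 32 mentioned above is the default).
      conf.setInt("hbase.regionserver.maxlogs", 16);
      // Per-region memstore flush threshold, in bytes (default 128 MB).
      conf.setLong("hbase.hregion.memstore.flush.size", 128L * 1024 * 1024);
      System.out.println("maxlogs = " + conf.getInt("hbase.regionserver.maxlogs", 32));
    }
  }

Lowering maxlogs makes log-forced flushes happen more often but in smaller batches; raising it defers them, which is the trade-off being discussed here.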
question about hbase environment variable
If I set HBASE_HEAPSIZE=2 (the heap is 20G), can I set the JVM options -Xmx20g -Xms20g? If not, how much can I set?
lzo lib missing, region server cannot start
I added lzo compression in the config file, but the region server cannot start; it seems the lzo lib is missing. How can I install the lzo lib for HBase, and which compression is used in production, snappy or lzo? Thanks all.

# /etc/init.d/hadoop-hbase-regionserver start
starting regionserver, logging to /var/log/hbase/hbase-hbase-regionserver-CH34.out
Exception in thread "main" java.lang.RuntimeException: Failed construction of Regionserver: class org.apache.hadoop.hbase.regionserver.HRegionServer
  at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:2805)
  at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:60)
  at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:75)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
  at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2829)
Caused by: java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

[root@CH34 ~]# less /var/log/hbase/hbase-hbase-regionserver-CH34.out
Exception in thread "main" java.lang.RuntimeException: Failed construction of Regionserver: class org.apache.hadoop.hbase.regionserver.HRegionServer
  at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:2805)
  at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:60)
  at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:75)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
  at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2829)
Caused by: java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:2803)
  ... 5 more
Caused by: java.io.IOException: Compression codec lzo not supported, aborting RS construction
  at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:295)
  ... 10 more

# hbase org.apache.hadoop.hbase.util.CompressionTest file:///root/jdk-6u35-linux-amd64.rpm lzo
13/07/01 15:45:05 INFO util.NativeCodeLoader: Loaded the native-hadoop library
Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzoCodec
  at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm$1.getCodec(Compression.java:110)
  at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:234)
  at org.apache.hadoop.hbase.io.hfile.HFile$Writer.getCompressingStream(HFile.java:397)
  at org.apache.hadoop.hbase.io.hfile.HFile$Writer.newBlock(HFile.java:383)
  at org.apache.hadoop.hbase.io.hfile.HFile$Writer.checkBlockBoundary(HFile.java:354)
  at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:536)
  at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:515)
  at org.apache.hadoop.hbase.util.CompressionTest.doSmokeTest(CompressionTest.java:108)
  at org.apache.hadoop.hbase.util.CompressionTest.main(CompressionTest.java:134)
Caused by: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzoCodec
  at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
  at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm$1.getCodec(Compression.java:105)
  ... 8 more
Re: How many column families in one table ?
Thanks Dhaval/Michael/Ted/Otis for your replies. Actually, I asked this question because I am seeing some performance degradation in my production HBase setup. I have configured HBase in pseudo-distributed mode on top of HDFS. I have created 17 column families :( and am actually using 14 of those 17. Each column family has on average 8-10 column qualifiers, so there are around 140 columns in total for each row key. I have around 1.6 million rows in the table. To completely scan the table for all 140 columns takes around 30-40 minutes. Is that normal, or should I redesign my table schema (probably merging 4-5 column families into one, so that at the end I have just 3-4 CFs)?

On Sat, Jun 29, 2013 at 12:06 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hm, works for me - http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning Shorter version: http://search-hadoop.com/m/qOx8l15Z1q42 Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm

On Fri, Jun 28, 2013 at 8:40 AM, Vimal Jain vkj...@gmail.com wrote: Hi all, thanks for your replies. Ted, thanks for the link, but it's not working. :(

On Fri, Jun 28, 2013 at 5:57 PM, Ted Yu yuzhih...@gmail.com wrote: Vimal: Please also refer to: http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning

On Fri, Jun 28, 2013 at 1:37 PM, Michel Segel michael_se...@hotmail.com wrote: Short answer... As few as possible. 14 CFs doesn't make too much sense. Sent from a remote device. Please excuse any typos... Mike Segel

On Jun 28, 2013, at 12:20 AM, Vimal Jain vkj...@gmail.com wrote: Hi, How many column families should there be in an HBase table? Is there any performance issue in read/write if we have more column families? I have designed one table with around 14 column families in it, each having on average 6 qualifiers. Is it a good design?

-- Thanks and Regards, Vimal Jain
Re: How many column families in one table ?
When you did the scan, did you check what the bottleneck was ? Was it I/O ? Did you see any GC locks ? How much RAM are you giving to your RS ? -Viral On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain vkj...@gmail.com wrote: To completely scan the table for all 140 columns , it takes around 30-40 minutes.
Re: How many column families in one table ?
I scanned it during normal traffic hours. There was no I/O load on the server, and I don't see any GC locks either. Also, I have given 1.5G to the RS and 512M each to the Master and ZooKeeper. One correction to the post above: the actual time to scan the whole table is even more; it takes 10 mins to scan 0.1 million rows (so a total of 2.5 hours to scan 1.6 million rows). The time I mentioned in the previous post was for a different type of lookup, please ignore that.

On Mon, Jul 1, 2013 at 2:24 PM, Viral Bajaria viral.baja...@gmail.com wrote: When you did the scan, did you check what the bottleneck was ? Was it I/O ? Did you see any GC locks ? How much RAM are you giving to your RS ? -Viral

On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain vkj...@gmail.com wrote: To completely scan the table for all 140 columns , it takes around 30-40 minutes.

-- Thanks and Regards, Vimal Jain
Re: lzo lib missing ,region server can not start
Please take a look at http://hbase.apache.org/book.html#lzo.compression and the links in that section. Cheers

On Mon, Jul 1, 2013 at 3:57 PM, ch huang justlo...@gmail.com wrote: i add lzo compression in config file ,but region server can not start,it seems lzo lib is miss,how can i install lzo lib for hbase,and in production which compress is used ? snappy or lzo? thanks all [quoted stack traces snipped]
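As a footnote to the book section Ted points to: the codec jars and native libraries have to be present on every region server, and compression is then enabled per column family. A rough sketch against the 0.94 client API (the table and family names here are made up; run CompressionTest successfully on each node before creating the table):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.io.hfile.Compression;

  public class CreateCompressedTable {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HBaseAdmin admin = new HBaseAdmin(conf);
      HTableDescriptor table = new HTableDescriptor("mytable");
      HColumnDescriptor family = new HColumnDescriptor("d");
      // SNAPPY (or LZO) only works once the codec is installed on every RS.
      family.setCompressionType(Compression.Algorithm.SNAPPY);
      table.addFamily(family);
      admin.createTable(table);
      admin.close();
    }
  }

On the snappy-vs-lzo question: Snappy only needs the Hadoop native libraries plus libsnappy, whereas LZO needs the separately distributed (GPL) hadoop-lzo jar and native library, which is one reason many production setups end up on Snappy.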
HBASE-7846 : is it safe to use on 0.94.4 ?
Hi, Just wanted to check if it's safe to use the JIRA mentioned in the subject i.e. https://issues.apache.org/jira/browse/HBASE-7846 Thanks, Viral
Re: Issues with delete markers
That would be quite a dramatic change; we cannot pass delete markers to the existing filters without confusing them. We could invent a new method (filterDeleteKV or filterDeleteMarker or something) on filters, along with a new filter type that implements that method. -- Lars

- Original Message - From: Varun Sharma va...@pinterest.com To: d...@hbase.apache.org d...@hbase.apache.org; user@hbase.apache.org Cc: Sent: Sunday, June 30, 2013 1:56 PM Subject: Re: Issues with delete markers

Sorry, typo, I meant that for user scans, should we be passing delete markers through the filters as well? Varun

On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma va...@pinterest.com wrote: For user scans, I feel we should be passing delete markers through as well.

On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma va...@pinterest.com wrote: I tried this a little bit and it seems that filters are not called on delete markers. For raw scans returning delete markers, does it make sense to do that? Varun

On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma va...@pinterest.com wrote: Hi, We are having an issue with the way HBase handles deletes. We are looking to retrieve 300 columns in a row, but the row has tens of thousands of delete markers in it before we reach the 300 columns, something like this:

row DeleteCol1 Col1 DeleteCol2 Col2 ... DeleteCol3 Col3

And so on. The issue here is that to retrieve these 300 columns, we need to go through tens of thousands of deletes - sometimes we get a spurt of these queries and that DDoSes a region server. We are okay with saying: only return the first 300 columns and stop once you encounter, say, 5K column delete markers or something. I wonder if such a construct is provided by HBase, or do we need to build something on top of a RAW scan and handle the delete masking there? Thanks Varun
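For what it's worth, the client-side version of the workaround being discussed can be sketched roughly like this against the 0.94 API (table name, row key and thresholds are placeholders; it only counts delete markers and stops early, it does not do the masking a server-side solution would do):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.KeyValue;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.ResultScanner;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.util.Bytes;

  public class RawScanDeleteCounter {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "mytable");
      // Scan just this one row; the stop row ("myrow" + a 0x00 byte) is exclusive.
      Scan scan = new Scan(Bytes.toBytes("myrow"), Bytes.toBytes("myrow\0"));
      scan.setRaw(true);      // return delete markers and deleted cells too
      scan.setMaxVersions();  // raw scans should not collapse versions
      scan.setBatch(100);     // limit the number of columns per Result
      int deleteMarkers = 0, columns = 0;
      ResultScanner scanner = table.getScanner(scan);
      try {
        Result r;
        while ((r = scanner.next()) != null) {
          for (KeyValue kv : r.raw()) {
            if (kv.isDelete()) deleteMarkers++; else columns++;
          }
          if (columns >= 300 || deleteMarkers >= 5000) break;  // stop early
        }
      } finally {
        scanner.close();
        table.close();
      }
      System.out.println(columns + " columns, " + deleteMarkers + " delete markers");
    }
  }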
Re: Poor HBase map-reduce scan performance
Absolutely. - Original Message - From: Ted Yu yuzhih...@gmail.com To: user@hbase.apache.org Cc: Sent: Sunday, June 30, 2013 9:32 PM Subject: Re: Poor HBase map-reduce scan performance Looking at the tail of HBASE-8369, there were some comments which are yet to be addressed. I think trunk patch should be finalized before backporting. Cheers On Mon, Jul 1, 2013 at 12:23 PM, Bryan Keller brya...@gmail.com wrote: I'll attach my patch to HBASE-8369 tomorrow. On Jun 28, 2013, at 10:56 AM, lars hofhansl la...@apache.org wrote: If we can make a clean patch with minimal impact to existing code I would be supportive of a backport to 0.94. -- Lars - Original Message - From: Bryan Keller brya...@gmail.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Cc: Sent: Tuesday, June 25, 2013 1:56 AM Subject: Re: Poor HBase map-reduce scan performance I tweaked Enis's snapshot input format and backported it to 0.94.6 and have snapshot scanning functional on my system. Performance is dramatically better, as expected i suppose. I'm seeing about 3.6x faster performance vs TableInputFormat. Also, HBase doesn't get bogged down during a scan as the regionserver is being bypassed. I'm very excited by this. There are some issues with file permissions and library dependencies but nothing that can't be worked out. On Jun 5, 2013, at 6:03 PM, lars hofhansl la...@apache.org wrote: That's exactly the kind of pre-fetching I was investigating a bit ago (made a patch, but ran out of time). This pre-fetching is strictly client only, where the client keeps the server busy while it is processing the previous batch, but filling up a 2nd buffer. -- Lars From: Sandy Pratt prat...@adobe.com To: user@hbase.apache.org user@hbase.apache.org Sent: Wednesday, June 5, 2013 10:58 AM Subject: Re: Poor HBase map-reduce scan performance Yong, As a thought experiment, imagine how it impacts the throughput of TCP to keep the window size at 1. That means there's only one packet in flight at a time, and total throughput is a fraction of what it could be. That's effectively what happens with RPC. The server sends a batch, then does nothing while it waits for the client to ask for more. During that time, the pipe between them is empty. Increasing the batch size can help a bit, in essence creating a really huge packet, but the problem remains. There will always be stalls in the pipe. What you want is for the window size to be large enough that the pipe is saturated. A streaming API accomplishes that by stuffing data down the network pipe as quickly as possible. Sandy On 6/5/13 7:55 AM, yonghu yongyong...@gmail.com wrote: Can anyone explain why client + rpc + server will decrease the performance of scanning? I mean the Regionserver and Tasktracker are the same node when you use MapReduce to scan the HBase table. So, in my understanding, there will be no rpc cost. Thanks! Yong On Wed, Jun 5, 2013 at 10:09 AM, Sandy Pratt prat...@adobe.com wrote: https://issues.apache.org/jira/browse/HBASE-8691 On 6/4/13 6:11 PM, Sandy Pratt prat...@adobe.com wrote: Haven't had a chance to write a JIRA yet, but I thought I'd pop in here with an update in the meantime. I tried a number of different approaches to eliminate latency and bubbles in the scan pipeline, and eventually arrived at adding a streaming scan API to the region server, along with refactoring the scan interface into an event-drive message receiver interface. 
In so doing, I was able to take scan speed on my cluster from 59,537 records/sec with the classic scanner to 222,703 records per second with my new scan API. Needless to say, I'm pleased ;) More details forthcoming when I get a chance. Thanks, Sandy On 5/23/13 3:47 PM, Ted Yu yuzhih...@gmail.com wrote: Thanks for the update, Sandy. If you can open a JIRA and attach your producer / consumer scanner there, that would be great. On Thu, May 23, 2013 at 3:42 PM, Sandy Pratt prat...@adobe.com wrote: I wrote myself a Scanner wrapper that uses a producer/consumer queue to keep the client fed with a full buffer as much as possible. When scanning my table with scanner caching at 100 records, I see about a 24% uplift in performance (~35k records/sec with the ClientScanner and ~44k records/sec with my P/C scanner). However, when I set scanner caching to 5000, it's more of a wash compared to the standard ClientScanner: ~53k records/sec with the ClientScanner and ~60k records/sec with the P/C scanner. I'm not sure what to make of those results. I think next I'll shut down HBase and read the HFiles directly, to see if there's a drop off in performance between reading them directly vs. via the RegionServer. I still think that to really solve
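Sandy's wrapper isn't attached to the thread, but the producer/consumer idea is simple to sketch: a background thread drains the real ResultScanner into a bounded queue so the client rarely waits on the network. The following is a rough, untested illustration against the 0.94 client API, not Sandy's actual code; real code needs error propagation and a way to stop the producer early:

  import java.util.concurrent.ArrayBlockingQueue;
  import java.util.concurrent.BlockingQueue;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.ResultScanner;

  /** Wraps a ResultScanner and prefetches Results on a background thread. */
  public class PrefetchingScanner {
    private static final Result POISON = new Result();  // end-of-stream marker
    private final BlockingQueue<Result> queue;
    private final Thread producer;

    public PrefetchingScanner(final ResultScanner scanner, int bufferSize) {
      this.queue = new ArrayBlockingQueue<Result>(bufferSize);
      this.producer = new Thread(new Runnable() {
        public void run() {
          try {
            Result r;
            while ((r = scanner.next()) != null) {
              queue.put(r);  // blocks when the buffer is full
            }
          } catch (Exception e) {
            // real code should hand this exception to the consumer
          } finally {
            try { queue.put(POISON); } catch (InterruptedException ignored) { }
            scanner.close();
          }
        }
      });
      this.producer.setDaemon(true);
      this.producer.start();
    }

    /** Returns the next Result, or null when the underlying scanner is exhausted. */
    public Result next() throws InterruptedException {
      Result r = queue.take();
      return r == POISON ? null : r;
    }
  }

Usage would be something like new PrefetchingScanner(table.getScanner(scan), 16), then loop on next() until it returns null; the buffer size and error handling are where the real tuning lives.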
Re: How many column families in one table ?
Can someone please reply ? Also what is the typical read/write speed of hbase and how much deviation would be there in my scenario mentioned above (14 cf , total 140 columns ) ? I am asking this because i am not simply printing out the scanned values , instead i am applying some logic on the data retrieved per row basis. So was just curious to find if that small logic in my code is contributing towards the long time taken to scan the table. On Mon, Jul 1, 2013 at 2:41 PM, Vimal Jain vkj...@gmail.com wrote: I scanned it during normal traffic hours.There was no I/O load on the server. I dont see any GC locks too. Also i have given 1.5G to RS , 512M to each Master and Zookeeper. One correction in the post above : Actual time to scan whole table is even more , it takes 10 mins to scan 0.1 million rows ( so total of 2.5 hours to scan 1.6 million rows) . The time i mentioned in previous post was for different type of lookup.Please ignore that. On Mon, Jul 1, 2013 at 2:24 PM, Viral Bajaria viral.baja...@gmail.comwrote: When you did the scan, did you check what the bottleneck was ? Was it I/O ? Did you see any GC locks ? How much RAM are you giving to your RS ? -Viral On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain vkj...@gmail.com wrote: To completely scan the table for all 140 columns , it takes around 30-40 minutes. -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain
Re: Behavior of Filter.transform() in FilterList?
You want transform to only be called on filters that are reached? I.e. with FilterA and FilterB, FilterB.transform should not be called if a KV is already filtered by FilterA? That's not how it works right now; transform is called in a completely different code path from the actual filtering logic. -- Lars

- Original Message - From: Christophe Taton ta...@wibidata.com To: user@hbase.apache.org Cc: Sent: Sunday, June 30, 2013 10:26 PM Subject: Re: Behavior of Filter.transform() in FilterList?

On Sun, Jun 30, 2013 at 10:15 PM, Ted Yu yuzhih...@gmail.com wrote: The clause 'family=X and column=Y and KeyOnlyFilter' would be represented by a FilterList, right ? (family=A and column=B) would be represented by another FilterList.

Yes, that would be FilterList(OR, [FilterList(AND, [family=X, column=Y, KeyOnlyFilter]), FilterList(AND, [family=A, column=B])]).

So the behavior is expected.

Could you explain? I'm not sure how you reach this conclusion. Are you saying it is expected, given the actual implementation of FilterList.transform()? Or are there some other details I missed? Thanks! C.

On Mon, Jul 1, 2013 at 1:10 PM, Christophe Taton ta...@wibidata.com wrote: Hi, From https://github.com/apache/hbase/blob/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java#L183 , it appears that Filter.transform() is invoked unconditionally on all filters in a FilterList hierarchy. This is quite confusing, especially since I may construct a filter like: (family=X and column=Y and KeyOnlyFilter) or (family=A and column=B) The KeyOnlyFilter will remove all values from the KeyValues in A:B as well. Is my understanding correct? Is this an expected/intended behavior? Thanks, C.
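For readers following along, the filter Christophe describes would be assembled roughly as below (0.94 API; the family and qualifier names are placeholders). The behavior under discussion is that transform() from the KeyOnlyFilter in the first branch ends up being applied even to KeyValues that only match the second branch:

  import java.util.Arrays;
  import org.apache.hadoop.hbase.filter.BinaryComparator;
  import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
  import org.apache.hadoop.hbase.filter.FamilyFilter;
  import org.apache.hadoop.hbase.filter.Filter;
  import org.apache.hadoop.hbase.filter.FilterList;
  import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
  import org.apache.hadoop.hbase.filter.QualifierFilter;
  import org.apache.hadoop.hbase.util.Bytes;

  public class FilterListExample {
    public static Filter build() {
      // (family=X and column=Y and KeyOnlyFilter)
      Filter left = new FilterList(FilterList.Operator.MUST_PASS_ALL, Arrays.<Filter>asList(
          new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("X"))),
          new QualifierFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("Y"))),
          new KeyOnlyFilter()));
      // (family=A and column=B)
      Filter right = new FilterList(FilterList.Operator.MUST_PASS_ALL, Arrays.<Filter>asList(
          new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("A"))),
          new QualifierFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("B")))));
      // OR the two branches together.
      return new FilterList(FilterList.Operator.MUST_PASS_ONE, Arrays.<Filter>asList(left, right));
    }
  }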
Re: How many column families in one table ?
Which version of HBase? Did you enable scanner caching? Otherwise each call to next() is an RPC roundtrip and you are basically measuring your network's RTT. -- Lars

From: Vimal Jain vkj...@gmail.com To: user@hbase.apache.org Sent: Monday, July 1, 2013 4:11 AM Subject: Re: How many column families in one table ?

Can someone please reply ? Also what is the typical read/write speed of hbase and how much deviation would be there in my scenario mentioned above (14 cf , total 140 columns ) ? I am asking this because i am not simply printing out the scanned values , instead i am applying some logic on the data retrieved per row basis. So was just curious to find if that small logic in my code is contributing towards the long time taken to scan the table.

On Mon, Jul 1, 2013 at 2:41 PM, Vimal Jain vkj...@gmail.com wrote: I scanned it during normal traffic hours.There was no I/O load on the server. I dont see any GC locks too. Also i have given 1.5G to RS , 512M to each Master and Zookeeper. One correction in the post above : Actual time to scan whole table is even more , it takes 10 mins to scan 0.1 million rows ( so total of 2.5 hours to scan 1.6 million rows) . The time i mentioned in previous post was for different type of lookup.Please ignore that.

On Mon, Jul 1, 2013 at 2:24 PM, Viral Bajaria viral.baja...@gmail.com wrote: When you did the scan, did you check what the bottleneck was ? Was it I/O ? Did you see any GC locks ? How much RAM are you giving to your RS ? -Viral

On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain vkj...@gmail.com wrote: To completely scan the table for all 140 columns , it takes around 30-40 minutes.

-- Thanks and Regards, Vimal Jain
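To make the scanner-caching point concrete, this is the kind of client-side setting Lars means; a sketch only, with example numbers that need tuning against the client and server heap:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.ResultScanner;
  import org.apache.hadoop.hbase.client.Scan;

  public class FullTableScan {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "mytable");
      Scan scan = new Scan();
      scan.setCaching(1000);       // rows fetched per RPC instead of one per next()
      scan.setCacheBlocks(false);  // don't pollute the block cache on a full scan
      ResultScanner scanner = table.getScanner(scan);
      long rows = 0;
      try {
        Result r;
        while ((r = scanner.next()) != null) {
          rows++;                  // per-row application logic would go here
        }
      } finally {
        scanner.close();
        table.close();
      }
      System.out.println("scanned " + rows + " rows");
    }
  }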
Re: How many column families in one table ?
Hi Lars, I am using Hadoop version - 1.1.2 and Hbase version - 0.94.7. Yes , I have enabled scanner caching with value 10K but performance is not too good. :( On Mon, Jul 1, 2013 at 4:48 PM, lars hofhansl la...@apache.org wrote: Which version of HBase? Did you enable scanner caching? Otherwise each call to next() is a RPC roundtrip and you are basically measuring your networks RTT. -- Lars From: Vimal Jain vkj...@gmail.com To: user@hbase.apache.org Sent: Monday, July 1, 2013 4:11 AM Subject: Re: How many column families in one table ? Can someone please reply ? Also what is the typical read/write speed of hbase and how much deviation would be there in my scenario mentioned above (14 cf , total 140 columns ) ? I am asking this because i am not simply printing out the scanned values , instead i am applying some logic on the data retrieved per row basis. So was just curious to find if that small logic in my code is contributing towards the long time taken to scan the table. On Mon, Jul 1, 2013 at 2:41 PM, Vimal Jain vkj...@gmail.com wrote: I scanned it during normal traffic hours.There was no I/O load on the server. I dont see any GC locks too. Also i have given 1.5G to RS , 512M to each Master and Zookeeper. One correction in the post above : Actual time to scan whole table is even more , it takes 10 mins to scan 0.1 million rows ( so total of 2.5 hours to scan 1.6 million rows) . The time i mentioned in previous post was for different type of lookup.Please ignore that. On Mon, Jul 1, 2013 at 2:24 PM, Viral Bajaria viral.baja...@gmail.com wrote: When you did the scan, did you check what the bottleneck was ? Was it I/O ? Did you see any GC locks ? How much RAM are you giving to your RS ? -Viral On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain vkj...@gmail.com wrote: To completely scan the table for all 140 columns , it takes around 30-40 minutes. -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain
Re: How many column families in one table ?
bq. I have configured Hbase in pseudo distributed mode on top of HDFS. What was the reason for using pseudo distributed mode in production setup ? Cheers On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain vkj...@gmail.com wrote: Thanks Dhaval/Michael/Ted/Otis for your replies. Actually , i asked this question because i am seeing some performance degradation in my production Hbase setup. I have configured Hbase in pseudo distributed mode on top of HDFS. I have created 17 Column families :( . I am actually using 14 out of these 17 column families. Each column family has around on average 8-10 column qualifiers so total around 140 columns are there for each row key. I have around 1.6 millions rows in the table. To completely scan the table for all 140 columns , it takes around 30-40 minutes. Is it normal or Should i redesign my table schema ( probably merging 4-5 column families into one , so that at the end i have just 3-4 cf ) ? On Sat, Jun 29, 2013 at 12:06 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hm, works for me - http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning Shorter version: http://search-hadoop.com/m/qOx8l15Z1q42 Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Fri, Jun 28, 2013 at 8:40 AM, Vimal Jain vkj...@gmail.com wrote: Hi All , Thanks for your replies. Ted, Thanks for the link, but its not working . :( On Fri, Jun 28, 2013 at 5:57 PM, Ted Yu yuzhih...@gmail.com wrote: Vimal: Please also refer to: http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning On Fri, Jun 28, 2013 at 1:37 PM, Michel Segel michael_se...@hotmail.com wrote: Short answer... As few as possible. 14 CF doesn't make too much sense. Sent from a remote device. Please excuse any typos... Mike Segel On Jun 28, 2013, at 12:20 AM, Vimal Jain vkj...@gmail.com wrote: Hi, How many column families should be there in an hbase table ? Is there any performance issue in read/write if we have more column families ? I have designed one table with around 14 column families in it with each having on average 6 qualifiers. Is it a good design ? -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain
Re: How many column families in one table ?
Hi, We had some hardware constraints along with the fact that our total data size was in GBs. Thats why to start with Hbase , we first began with pseudo distributed mode and thought if required we would upgrade to fully distributed mode. On Mon, Jul 1, 2013 at 5:09 PM, Ted Yu yuzhih...@gmail.com wrote: bq. I have configured Hbase in pseudo distributed mode on top of HDFS. What was the reason for using pseudo distributed mode in production setup ? Cheers On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain vkj...@gmail.com wrote: Thanks Dhaval/Michael/Ted/Otis for your replies. Actually , i asked this question because i am seeing some performance degradation in my production Hbase setup. I have configured Hbase in pseudo distributed mode on top of HDFS. I have created 17 Column families :( . I am actually using 14 out of these 17 column families. Each column family has around on average 8-10 column qualifiers so total around 140 columns are there for each row key. I have around 1.6 millions rows in the table. To completely scan the table for all 140 columns , it takes around 30-40 minutes. Is it normal or Should i redesign my table schema ( probably merging 4-5 column families into one , so that at the end i have just 3-4 cf ) ? On Sat, Jun 29, 2013 at 12:06 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hm, works for me - http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning Shorter version: http://search-hadoop.com/m/qOx8l15Z1q42 Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Fri, Jun 28, 2013 at 8:40 AM, Vimal Jain vkj...@gmail.com wrote: Hi All , Thanks for your replies. Ted, Thanks for the link, but its not working . :( On Fri, Jun 28, 2013 at 5:57 PM, Ted Yu yuzhih...@gmail.com wrote: Vimal: Please also refer to: http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning On Fri, Jun 28, 2013 at 1:37 PM, Michel Segel michael_se...@hotmail.com wrote: Short answer... As few as possible. 14 CF doesn't make too much sense. Sent from a remote device. Please excuse any typos... Mike Segel On Jun 28, 2013, at 12:20 AM, Vimal Jain vkj...@gmail.com wrote: Hi, How many column families should be there in an hbase table ? Is there any performance issue in read/write if we have more column families ? I have designed one table with around 14 column families in it with each having on average 6 qualifiers. Is it a good design ? -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain
Re: question about hbase envionmnet variable
Looking at bin/hbase:

  # check envvars which might override default args
  if [ "$HBASE_HEAPSIZE" != "" ]; then
    #echo "run with heapsize $HBASE_HEAPSIZE"
    JAVA_HEAP_MAX="-Xmx""$HBASE_HEAPSIZE""m"
    #echo $JAVA_HEAP_MAX
  fi

Meaning, if you set the HBASE_HEAPSIZE environment variable, bin/hbase would take care of setting -Xmx. Cheers

On Mon, Jul 1, 2013 at 12:11 AM, ch huang justlo...@gmail.com wrote: if i set HBASE_HEAPSIZE=2 (HEAP is 20G ) ,can i set jvm option -Xmx20g -Xms20G? if not ,how much i can set?
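In other words, HBASE_HEAPSIZE is expressed in megabytes (bin/hbase appends the "m" itself), so setting it to 2 would give a 2 MB heap, not 20 GB. For a 20 GB heap the usual setting in conf/hbase-env.sh would look something like the line below; whether the region server should really get the whole 20 GB is a separate tuning question:

  # conf/hbase-env.sh -- value is in MB; bin/hbase turns this into -Xmx20480m
  export HBASE_HEAPSIZE=20480
  # extra JVM flags such as -Xms can go into HBASE_OPTS instead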
Re: Issues with delete markers
So, yesterday, I implemented this change via a coprocessor which basically initiates a raw scan, keeps track of the number of delete markers encountered, and stops when a configured threshold is met. It instantiates its own ScanDeleteTracker to do the masking through delete markers. So: raw scan, count delete markers and stop if too many are encountered, and mask them so as to return sane stuff back to the client. I guess until now it has been working reasonably. Also, with HBASE-8809, version tracking etc. should also work with filters now.

On Mon, Jul 1, 2013 at 3:58 AM, lars hofhansl la...@apache.org wrote: That would be quite dramatic change, we cannot pass delete markers to the existing filters without confusing them. We could invent a new method (filterDeleteKV or filterDeleteMarker or something) on filters along with a new filter type that implements that method. -- Lars

- Original Message - From: Varun Sharma va...@pinterest.com To: d...@hbase.apache.org d...@hbase.apache.org; user@hbase.apache.org Cc: Sent: Sunday, June 30, 2013 1:56 PM Subject: Re: Issues with delete markers

Sorry, typo, i meant that for user scans, should we be passing delete markers through the filters as well ? Varun

On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma va...@pinterest.com wrote: For user scans, i feel we should be passing delete markers through as well.

On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma va...@pinterest.com wrote: I tried this a little bit and it seems that filters are not called on delete markers. For raw scans returning delete markers, does it make sense to do that ? Varun

On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma va...@pinterest.com wrote: Hi, We are having an issue with the way HBase does handling of deletes. We are looking to retrieve 300 columns in a row but the row has tens of thousands of delete markers in it before we reach the 300 columns, something like this: row DeleteCol1 Col1 DeleteCol2 Col2 ... DeleteCol3 Col3 And so on. Therefore, the issue here being that to retrieve these 300 columns, we need to go through tens of thousands of deletes - sometimes we get a spurt of these queries and that DDoSes a region server. We are okay with saying, only return first 300 columns and stop once you encounter, say 5K column delete markers or something. I wonder if such a construct is provided by HBase or do we need to build something on top of the RAW scan and handle the delete masking there. Thanks Varun
Re: Issues with delete markers
I mean version tracking with delete markers... On Mon, Jul 1, 2013 at 8:17 AM, Varun Sharma va...@pinterest.com wrote: So, yesterday, I implemented this change via a coprocessor which basically initiates a scan which is raw, keeps tracking of # of delete markers encountered and stops when a configured threshold is met. It instantiates its own ScanDeleteTracker to do the masking through delete markers. So raw scan, count delete markers/stop if too many encountered and mask them so to return sane stuff back to the client. I guess until now it has been working reasonably. Also, with HBase 8809, version tracking etc. should also work with filters now. On Mon, Jul 1, 2013 at 3:58 AM, lars hofhansl la...@apache.org wrote: That would be quite dramatic change, we cannot pass delete markers to the existing filters without confusing them. We could invent a new method (filterDeleteKV or filterDeleteMarker or something) on filters along with a new filter type that implements that method. -- Lars - Original Message - From: Varun Sharma va...@pinterest.com To: d...@hbase.apache.org d...@hbase.apache.org; user@hbase.apache.org Cc: Sent: Sunday, June 30, 2013 1:56 PM Subject: Re: Issues with delete markers Sorry, typo, i meant that for user scans, should we be passing delete markers through.the filters as well ? Varun On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma va...@pinterest.com wrote: For user scans, i feel we should be passing delete markers through as well. On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma va...@pinterest.com wrote: I tried this a little bit and it seems that filters are not called on delete markers. For raw scans returning delete markers, does it make sense to do that ? Varun On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma va...@pinterest.com wrote: Hi, We are having an issue with the way HBase does handling of deletes. We are looking to retrieve 300 columns in a row but the row has tens of thousands of delete markers in it before we span the 300 columns something like this row DeleteCol1 Col1 DeleteCol2 Col2 ... DeleteCol3 Col3 And so on. Therefore, the issue here, being that to retrieve these 300 columns, we need to go through tens of thousands of deletes - sometimes we get a spurt of these queries and that DDoSes a region server. We are okay with saying, only return first 300 columns and stop once you encounter, say 5K column delete markers or something. I wonder if such a construct is provided by HBase or do we need to build something on top of the RAW scan and handle the delete masking there. Thanks Varun
Re: Issues with delete markers
That is the easy part :) The hard part is to add this to filters in a backwards compatible way. -- Lars - Original Message - From: Varun Sharma va...@pinterest.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Cc: d...@hbase.apache.org d...@hbase.apache.org Sent: Monday, July 1, 2013 8:18 AM Subject: Re: Issues with delete markers I mean version tracking with delete markers... On Mon, Jul 1, 2013 at 8:17 AM, Varun Sharma va...@pinterest.com wrote: So, yesterday, I implemented this change via a coprocessor which basically initiates a scan which is raw, keeps tracking of # of delete markers encountered and stops when a configured threshold is met. It instantiates its own ScanDeleteTracker to do the masking through delete markers. So raw scan, count delete markers/stop if too many encountered and mask them so to return sane stuff back to the client. I guess until now it has been working reasonably. Also, with HBase 8809, version tracking etc. should also work with filters now. On Mon, Jul 1, 2013 at 3:58 AM, lars hofhansl la...@apache.org wrote: That would be quite dramatic change, we cannot pass delete markers to the existing filters without confusing them. We could invent a new method (filterDeleteKV or filterDeleteMarker or something) on filters along with a new filter type that implements that method. -- Lars - Original Message - From: Varun Sharma va...@pinterest.com To: d...@hbase.apache.org d...@hbase.apache.org; user@hbase.apache.org Cc: Sent: Sunday, June 30, 2013 1:56 PM Subject: Re: Issues with delete markers Sorry, typo, i meant that for user scans, should we be passing delete markers through.the filters as well ? Varun On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma va...@pinterest.com wrote: For user scans, i feel we should be passing delete markers through as well. On Sun, Jun 30, 2013 at 12:35 PM, Varun Sharma va...@pinterest.com wrote: I tried this a little bit and it seems that filters are not called on delete markers. For raw scans returning delete markers, does it make sense to do that ? Varun On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma va...@pinterest.com wrote: Hi, We are having an issue with the way HBase does handling of deletes. We are looking to retrieve 300 columns in a row but the row has tens of thousands of delete markers in it before we span the 300 columns something like this row DeleteCol1 Col1 DeleteCol2 Col2 ... DeleteCol3 Col3 And so on. Therefore, the issue here, being that to retrieve these 300 columns, we need to go through tens of thousands of deletes - sometimes we get a spurt of these queries and that DDoSes a region server. We are okay with saying, only return first 300 columns and stop once you encounter, say 5K column delete markers or something. I wonder if such a construct is provided by HBase or do we need to build something on top of the RAW scan and handle the delete masking there. Thanks Varun
Re: How many column families in one table ?
The performance you're seeing is definitely not typical. A couple of further questions:
- How large are your KVs (columns)?
- Do you delete data? Do you run major compactions?
- Can you measure CPU, IO, context switches, etc., during the scanning?
- Do you have many versions of the columns?
Note that HBase is a key value store, i.e. the storage is sparse. Each column is represented by its own key value pair, and HBase has to do the work to reassemble the data. -- Lars

From: Vimal Jain vkj...@gmail.com To: user@hbase.apache.org Sent: Monday, July 1, 2013 4:44 AM Subject: Re: How many column families in one table ?

Hi, We had some hardware constraints along with the fact that our total data size was in GBs. Thats why to start with Hbase , we first began with pseudo distributed mode and thought if required we would upgrade to fully distributed mode.

On Mon, Jul 1, 2013 at 5:09 PM, Ted Yu yuzhih...@gmail.com wrote: bq. I have configured Hbase in pseudo distributed mode on top of HDFS. What was the reason for using pseudo distributed mode in production setup ? Cheers

On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain vkj...@gmail.com wrote: Thanks Dhaval/Michael/Ted/Otis for your replies. Actually , i asked this question because i am seeing some performance degradation in my production Hbase setup. I have configured Hbase in pseudo distributed mode on top of HDFS. I have created 17 Column families :( . I am actually using 14 out of these 17 column families. Each column family has around on average 8-10 column qualifiers so total around 140 columns are there for each row key. I have around 1.6 millions rows in the table. To completely scan the table for all 140 columns , it takes around 30-40 minutes. Is it normal or Should i redesign my table schema ( probably merging 4-5 column families into one , so that at the end i have just 3-4 cf ) ?

On Sat, Jun 29, 2013 at 12:06 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hm, works for me - http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning Shorter version: http://search-hadoop.com/m/qOx8l15Z1q42 Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm

On Fri, Jun 28, 2013 at 8:40 AM, Vimal Jain vkj...@gmail.com wrote: Hi All , Thanks for your replies. Ted, Thanks for the link, but its not working . :(

On Fri, Jun 28, 2013 at 5:57 PM, Ted Yu yuzhih...@gmail.com wrote: Vimal: Please also refer to: http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning

On Fri, Jun 28, 2013 at 1:37 PM, Michel Segel michael_se...@hotmail.com wrote: Short answer... As few as possible. 14 CF doesn't make too much sense. Sent from a remote device. Please excuse any typos... Mike Segel

On Jun 28, 2013, at 12:20 AM, Vimal Jain vkj...@gmail.com wrote: Hi, How many column families should be there in an hbase table ? Is there any performance issue in read/write if we have more column families ? I have designed one table with around 14 column families in it with each having on average 6 qualifiers. Is it a good design ?

-- Thanks and Regards, Vimal Jain
Re: How many column families in one table ?
Hi Lars, 1)I have around 140 columns for each row , out of 140 , around 100 rows are holds java primitive data type , remaining 40 rows contains serialized java object as byte array. Yes , I do delete data but the frequency is very less ( 1 out of 5K operations ). I dont run any compaction. 2) I had ran scan keeping in mind the CPU,IO and other system related parameters.I found them to be normal with system load being 0.1-0.3. 3) Yes i have 3 versions of cell ( default value). On Mon, Jul 1, 2013 at 9:08 PM, lars hofhansl la...@apache.org wrote: The performance you're seeing is definitely not typical. 'couple of further questions: - How large are your KVs (columns)?- Do you delete data? Do you run major compactions? - Can you measure: CPU, IO, context switches, etc, during the scanning? - Do you have many versions of the columns? Note that HBase is a key value store, i.e. the storage is sparse. Each column is represented by its own key value pair, and HBase has to do the work to reassemble the data. -- Lars From: Vimal Jain vkj...@gmail.com To: user@hbase.apache.org Sent: Monday, July 1, 2013 4:44 AM Subject: Re: How many column families in one table ? Hi, We had some hardware constraints along with the fact that our total data size was in GBs. Thats why to start with Hbase , we first began with pseudo distributed mode and thought if required we would upgrade to fully distributed mode. On Mon, Jul 1, 2013 at 5:09 PM, Ted Yu yuzhih...@gmail.com wrote: bq. I have configured Hbase in pseudo distributed mode on top of HDFS. What was the reason for using pseudo distributed mode in production setup ? Cheers On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain vkj...@gmail.com wrote: Thanks Dhaval/Michael/Ted/Otis for your replies. Actually , i asked this question because i am seeing some performance degradation in my production Hbase setup. I have configured Hbase in pseudo distributed mode on top of HDFS. I have created 17 Column families :( . I am actually using 14 out of these 17 column families. Each column family has around on average 8-10 column qualifiers so total around 140 columns are there for each row key. I have around 1.6 millions rows in the table. To completely scan the table for all 140 columns , it takes around 30-40 minutes. Is it normal or Should i redesign my table schema ( probably merging 4-5 column families into one , so that at the end i have just 3-4 cf ) ? On Sat, Jun 29, 2013 at 12:06 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hm, works for me - http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning Shorter version: http://search-hadoop.com/m/qOx8l15Z1q42 Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Fri, Jun 28, 2013 at 8:40 AM, Vimal Jain vkj...@gmail.com wrote: Hi All , Thanks for your replies. Ted, Thanks for the link, but its not working . :( On Fri, Jun 28, 2013 at 5:57 PM, Ted Yu yuzhih...@gmail.com wrote: Vimal: Please also refer to: http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fbsubj=Re+HBase+Column+Family+Limit+Reasoning On Fri, Jun 28, 2013 at 1:37 PM, Michel Segel michael_se...@hotmail.com wrote: Short answer... As few as possible. 14 CF doesn't make too much sense. Sent from a remote device. Please excuse any typos... Mike Segel On Jun 28, 2013, at 12:20 AM, Vimal Jain vkj...@gmail.com wrote: Hi, How many column families should be there in an hbase table ? 
Is there any performance issue in read/write if we have more column families ? I have designed one table with around 14 column families in it with each having on average 6 qualifiers. Is it a good design ? -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain
BigSecret: A secure data management framework for Key-Value Stores
My name is Erman Pattuk. Together with my advisor Prof. Murat Kantarcioglu, we have developed an open source tool that enables secure and encrypted outsourcing of Key-Value stores to public cloud infrastructures. I would like to get feedback from interested users, so that I can improve and strengthen the project.

When you need to outsource your data to a public cloud, there are potential privacy and security risks. Especially if your data consists of highly private information, such as social security numbers or health records, it should be encrypted prior to outsourcing. However, doing so complicates data processing. Thus, intelligent solutions need to be created that (i) protect the privacy of the outsourced data, and (ii) enable efficient processing of the outsourced data.

Our framework, BigSecret, acts as middleware between the clients (entities that want to process queries) and Key-Value stores (which may be public or private). It is scalable, in the sense that multiple independent copies of the application can be executed, and it provides formally proven security. Initially, we have implemented BigSecret to support HBase. We have created a simple library that supports basic operations, such as Put, Get, Delete, Scan, and createTable, over encrypted key-value pairs. It's still in its infancy, but we aim to improve it over time and add support for multiple Key-Value Store implementations.

You can access:
- Source code: https://github.com/ermanpattuk/BigSecret
- Technical report: http://www.utdallas.edu/~exp111430/techReport.pdf

Our paper has been accepted at the prestigious IEEE Cloud 2013 conference. You can get a copy of the technical report via the above link. Best Regards, Erman Pattuk
Re: Behavior of Filter.transform() in FilterList?
It would make sense, but it is not immediately clear how to do so cleanly. We would no longer be able to call transform at the StoreScanner level (or we would have to evaluate the filter multiple times, or require the filters to maintain their last state and only apply transform selectively). I added transform() a while ago in order to allow a Filter *not* to transform. Before, we defensively made a copy of each key, just in case a Filter (such as KeyOnlyFilter) would modify it; now this is formalized, and the filter is responsible for making a copy only when needed. -- Lars

From: Christophe Taton ta...@wibidata.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Monday, July 1, 2013 10:27 AM Subject: Re: Behavior of Filter.transform() in FilterList?

On Mon, Jul 1, 2013 at 4:14 AM, lars hofhansl la...@apache.org wrote: You want transform to only be called on filters that are reached? I.e. FilterA and FilterB, FilterB.transform should not be called if a KV is already filtered by FilterA?

Yes, that's what I naively expected, at first.

That's not how it works right now, transform is called in a completely different code path from the actual filtering logic.

Indeed, I just learned that. I found no documentation of this behavior, did I miss it? In particular, the javadoc of the workflow of Filter doesn't mention transform() at all. Would it make sense to apply transform() only if the return code for filterKeyValue() includes the KeyValue? C.

-- Lars

- Original Message - From: Christophe Taton ta...@wibidata.com To: user@hbase.apache.org Cc: Sent: Sunday, June 30, 2013 10:26 PM Subject: Re: Behavior of Filter.transform() in FilterList?

On Sun, Jun 30, 2013 at 10:15 PM, Ted Yu yuzhih...@gmail.com wrote: The clause 'family=X and column=Y and KeyOnlyFilter' would be represented by a FilterList, right ? (family=A and colymn=B) would be represented by another FilterList.

Yes, that would be FilterList(OR, [FilterList(AND, [family=X, column=Y, KeyOnlyFilter]), FilterList(AND, [family=A, column=B])]). So the behavior is expected. Could you explain, I'm not sure how you reach this conclusion. Are you saying it is expected, given the actual implementation FilterList.transform()? Or are there some other details I missed? Thanks! C.

On Mon, Jul 1, 2013 at 1:10 PM, Christophe Taton ta...@wibidata.com wrote: Hi, From https://github.com/apache/hbase/blob/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java#L183 , it appears that Filter.transform() is invoked unconditionally on all filters in a FilterList hierarchy. This is quite confusing, especially since I may construct a filter like: (family=X and column=Y and KeyOnlyFilter) or (family=A and colymn=B) The KeyOnlyFilter will remove all values from the KeyValues in A:B as well. Is my understanding correct? Is this an expected/intended behavior? Thanks, C.
Re: How many column families in one table ?
On Mon, Jul 1, 2013 at 10:06 AM, Vimal Jain vkj...@gmail.com wrote: Sorry for the typo .. please ignore previous mail.. Here is the corrected one.. 1) I have around 140 columns for each row , out of 140 , around 100 columns hold java primitive data type , remaining 40 columns contain serialized java object as byte array (Inside each object is an ArrayList). Yes , I do delete data but the frequency is very less ( 1 out of 5K operations ). I dont run any compaction.

This answers the type of data in each cell, not the size of the data. Can you figure out the average size of the data that you insert in each cell? For example, what is the length of the byte array? Also, for the java primitives, is it an 8-byte long? A 4-byte int? In addition to that, what is in the row key? How long is that in bytes? Same for the column families: can you share the names of the column families? How about the qualifiers? If you have disabled major compactions, you should run one every few days (if not once a day) to consolidate the number of files that each scan will have to open.

2) I had run the scan keeping in mind the CPU, IO and other system related parameters. I found them to be normal, with system load being 0.1-0.3.

How many disks do you have in your box? Have you ever benchmarked the hardware? Thanks, Viral
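On the major compaction point: if automatic major compactions are disabled, they can still be triggered on a schedule, either from the shell or from a small client program such as the sketch below (0.94 admin API; the table name is a placeholder and the call is asynchronous):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HBaseAdmin;

  public class NightlyMajorCompact {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HBaseAdmin admin = new HBaseAdmin(conf);
      // Queues a major compaction for every region of the table and returns immediately.
      admin.majorCompact("mytable");
      admin.close();
    }
  }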
simple export -- bulk import
I'm currently struggling with export/import between two HBase clusters. I have managed to create incremental exports from the source cluster (using hbase Export). Now I would like to bulk load the export into the destination (presumably using HFiles). The reason for the bulk load requirement is that the destination cluster is NOT tuned for individual puts (which is what the default import does). I've tried importtsv, but it seems to get confused by the exported data and I end up with incorrect data in the destination. Has anyone successfully used export + import with a bulk load at the destination? If not, are there other utils I should consider using for this use case? Thanks, Mike Ellery
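One route that may fit this use case, assuming the Import tool in your HBase version supports the import.bulk.output option (worth verifying with its usage message before relying on it): let Import write HFiles instead of doing puts, then hand those files to the bulk loader. Roughly:

  # 1) On the source cluster: export the table (this part already works here)
  hbase org.apache.hadoop.hbase.mapreduce.Export mytable /export/mytable

  # 2) On the destination cluster: turn the export into HFiles instead of puts
  hbase org.apache.hadoop.hbase.mapreduce.Import -Dimport.bulk.output=/bulk/mytable mytable /export/mytable

  # 3) Bulk-load the HFiles into the (pre-created) destination table
  hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /bulk/mytable mytable

The destination table has to exist with matching column families before step 3, and importtsv is the wrong tool for this data: Export writes SequenceFiles of Result objects, not TSV, which is why it gets confused.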
stop_replication dangerous?
The first two tutorials for enabling replication that google gives me [1], [2] take very different tones with regard to stop_replication. The HBase docs [1] make it sound fine to start and stop replication as desired. The Cloudera docs [2] say it may cause data loss. Which is true? If data loss is possible, are we talking about data loss in the primary cluster, or data loss in the standby cluster (presumably would require reinitializing the sync with a new CopyTable). Thanks, Patrick [1] http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/replication/package-summary.html#requirements [2] http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_20_11.html
Re: stop_replication dangerous?
Yeah that package documentation ought to be changed. Mind opening a jira? Thx, J-D On Mon, Jul 1, 2013 at 1:51 PM, Patrick Schless patrick.schl...@gmail.com wrote: The first two tutorials for enabling replication that google gives me [1], [2] take very different tones with regard to stop_replication. The HBase docs [1] make it sound fine to start and stop replication as desired. The Cloudera docs [2] say it may cause data loss. Which is true? If data loss is possible, are we talking about data loss in the primary cluster, or data loss in the standby cluster (presumably would require reinitializing the sync with a new CopyTable). Thanks, Patrick [1] http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/replication/package-summary.html#requirements [2] http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_20_11.html
Re: Poor HBase map-reduce scan performance
Bryan, 3.6x improvement seems exciting. The ballpark difference between HBase scan and hdfs scan is in that order, so it is expected I guess. I plan to get back to the trunk patch, add more tests etc next week. In the mean time, if you have any changes to the patch, pls attach the patch. Enis On Mon, Jul 1, 2013 at 3:59 AM, lars hofhansl la...@apache.org wrote: Absolutely. - Original Message - From: Ted Yu yuzhih...@gmail.com To: user@hbase.apache.org Cc: Sent: Sunday, June 30, 2013 9:32 PM Subject: Re: Poor HBase map-reduce scan performance Looking at the tail of HBASE-8369, there were some comments which are yet to be addressed. I think trunk patch should be finalized before backporting. Cheers On Mon, Jul 1, 2013 at 12:23 PM, Bryan Keller brya...@gmail.com wrote: I'll attach my patch to HBASE-8369 tomorrow. On Jun 28, 2013, at 10:56 AM, lars hofhansl la...@apache.org wrote: If we can make a clean patch with minimal impact to existing code I would be supportive of a backport to 0.94. -- Lars - Original Message - From: Bryan Keller brya...@gmail.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Cc: Sent: Tuesday, June 25, 2013 1:56 AM Subject: Re: Poor HBase map-reduce scan performance I tweaked Enis's snapshot input format and backported it to 0.94.6 and have snapshot scanning functional on my system. Performance is dramatically better, as expected i suppose. I'm seeing about 3.6x faster performance vs TableInputFormat. Also, HBase doesn't get bogged down during a scan as the regionserver is being bypassed. I'm very excited by this. There are some issues with file permissions and library dependencies but nothing that can't be worked out. On Jun 5, 2013, at 6:03 PM, lars hofhansl la...@apache.org wrote: That's exactly the kind of pre-fetching I was investigating a bit ago (made a patch, but ran out of time). This pre-fetching is strictly client only, where the client keeps the server busy while it is processing the previous batch, but filling up a 2nd buffer. -- Lars From: Sandy Pratt prat...@adobe.com To: user@hbase.apache.org user@hbase.apache.org Sent: Wednesday, June 5, 2013 10:58 AM Subject: Re: Poor HBase map-reduce scan performance Yong, As a thought experiment, imagine how it impacts the throughput of TCP to keep the window size at 1. That means there's only one packet in flight at a time, and total throughput is a fraction of what it could be. That's effectively what happens with RPC. The server sends a batch, then does nothing while it waits for the client to ask for more. During that time, the pipe between them is empty. Increasing the batch size can help a bit, in essence creating a really huge packet, but the problem remains. There will always be stalls in the pipe. What you want is for the window size to be large enough that the pipe is saturated. A streaming API accomplishes that by stuffing data down the network pipe as quickly as possible. Sandy On 6/5/13 7:55 AM, yonghu yongyong...@gmail.com wrote: Can anyone explain why client + rpc + server will decrease the performance of scanning? I mean the Regionserver and Tasktracker are the same node when you use MapReduce to scan the HBase table. So, in my understanding, there will be no rpc cost. Thanks! Yong On Wed, Jun 5, 2013 at 10:09 AM, Sandy Pratt prat...@adobe.com wrote: https://issues.apache.org/jira/browse/HBASE-8691 On 6/4/13 6:11 PM, Sandy Pratt prat...@adobe.com wrote: Haven't had a chance to write a JIRA yet, but I thought I'd pop in here with an update in the meantime. 
I tried a number of different approaches to eliminate latency and bubbles in the scan pipeline, and eventually arrived at adding a streaming scan API to the region server, along with refactoring the scan interface into an event-driven message receiver interface. In so doing, I was able to take scan speed on my cluster from 59,537 records/sec with the classic scanner to 222,703 records/sec with my new scan API. Needless to say, I'm pleased ;) More details forthcoming when I get a chance. Thanks, Sandy On 5/23/13 3:47 PM, Ted Yu yuzhih...@gmail.com wrote: Thanks for the update, Sandy. If you can open a JIRA and attach your producer / consumer scanner there, that would be great. On Thu, May 23, 2013 at 3:42 PM, Sandy Pratt prat...@adobe.com wrote: I wrote myself a Scanner wrapper that uses a producer/consumer queue to keep the client fed with a full buffer as much as possible. When scanning my table with scanner caching at 100 records, I see about a 24% uplift in performance (~35k records/sec with
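For readers wondering what the producer/consumer wrapper looks like in practice, here is a minimal sketch of the idea (my own illustration, not Sandy's code): one background thread drives the ResultScanner and fills a bounded queue, so the next RPC batch is being fetched while the caller is still processing the previous one.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;

public class PrefetchingScanner implements Runnable {
  private static final Result DONE = new Result();   // sentinel marking end of scan
  private final ResultScanner scanner;
  private final BlockingQueue<Result> queue = new ArrayBlockingQueue<Result>(1000);

  public PrefetchingScanner(ResultScanner scanner) {
    this.scanner = scanner;
    new Thread(this, "scan-prefetch").start();       // producer thread
  }

  public void run() {
    try {
      for (Result r : scanner) {                     // each iteration may trigger a next() RPC
        queue.put(r);
      }
    } catch (Exception e) {
      // a real implementation would hand the error to the consumer
    } finally {
      try { queue.put(DONE); } catch (InterruptedException ignored) { }
      scanner.close();
    }
  }

  // Returns the next Result, or null once the scan is exhausted.
  public Result next() throws InterruptedException {
    Result r = queue.take();
    return r == DONE ? null : r;
  }
}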
Re: Poor HBase map-reduce scan performance
I attached my patch to the JIRA issue, in case anyone is interested. It can pretty easily be used on its own without patching HBase. I am currently doing this. On Jul 1, 2013, at 2:23 PM, Enis Söztutar enis@gmail.com wrote: Bryan, 3.6x improvement seems exciting. The ballpark difference between HBase scan and hdfs scan is in that order, so it is expected I guess. I plan to get back to the trunk patch, add more tests etc next week. In the mean time, if you have any changes to the patch, pls attach the patch. Enis
Re: data loss after cluster wide power loss
Yes this is a known issue. The HDFS part of this was addressed in https://issues.apache.org/jira/browse/HDFS-744 for 2.0.2-alpha and is not available in the 1.x releases. I think HBase does not use this API yet. On Mon, Jul 1, 2013 at 3:00 PM, Dave Latham lat...@davelink.net wrote: We're running HBase over HDFS 1.0.2 on about 1000 nodes. On Saturday the data center we were in had a total power failure and the cluster went down hard. When we brought it back up, HDFS reported 4 files as CORRUPT. We recovered the data in question from our secondary datacenter, but I'm trying to understand what happened and whether this is a bug in HDFS that should be fixed. From what I can tell the file was created and closed by the dfs client (hbase). Then HBase renamed it into a new directory and deleted some other files containing the same data. Then the cluster lost power. After the cluster was restarted, the datanodes reported into the namenode but the blocks for this file appeared as blocks being written - the namenode rejected them and the datanodes deleted the blocks. At this point there were no replicas for the blocks and the files were marked CORRUPT. The underlying file systems are ext3. Some questions that I would love to get answers for, if anyone with a deeper understanding of HDFS can chime in:
- Is this a known scenario where data loss is expected? (I found HDFS-1539 but that seems different)
- When are blocks moved from blocksBeingWritten to current? Does that happen before a file close operation is acknowledged to an HDFS client?
- Could it be that the DataNodes actually moved the blocks to current but after the restart ext3 rewound state somehow (forgive my ignorance of underlying file system behavior)?
- Is there any other explanation for how this can happen?
Here is a sequence of selected relevant log lines from the RS (HBase Region Server), NN (NameNode) and DN (DataNode - 1 example of the 3 in question). It includes everything that mentions the block in question in the NameNode and one DataNode log. Please let me know if there is more information that would be helpful.
RS 2013-06-29 11:16:06,812 DEBUG org.apache.hadoop.hbase.util.FSUtils: Creating file=hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c with permission=rwxrwxrwx
NN 2013-06-29 11:16:06,830 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c. blk_1395839728632046111_357084589
DN 2013-06-29 11:16:06,832 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_1395839728632046111_357084589 src: /10.0.5.237:14327 dest: /10.0.5.237:50010
NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.0.6.1:50010 is added to blk_1395839728632046111_357084589 size 25418340
NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.0.6.24:50010 is added to blk_1395839728632046111_357084589 size 25418340
NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.0.5.237:50010 is added to blk_1395839728632046111_357084589 size 25418340
DN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_1395839728632046111_357084589 of size 25418340 from /10.0.5.237:14327
DN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block blk_1395839728632046111_357084589 terminating
NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on file /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c from client DFSClient_hb_rs_hs745,60020,1372470111932
NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c is closed by DFSClient_hb_rs_hs745,60020,1372470111932
RS 2013-06-29 11:16:11,393 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming compacted file at hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c to hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/n/6e0cc30af6e64e56ba5a539fdf159c4c
RS 2013-06-29 11:16:11,505 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 7 file(s) in n of users-6,\x12\xBDp\xA3,1359426311784.b5b0820cde759ae68e333b2f4015bb7e. into 6e0cc30af6e64e56ba5a539fdf159c4c, size=24.2m; total size for store is 24.2m
--- CRASH, RESTART ---
NN 2013-06-29 12:01:19,743 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: addStoredBlock request received for blk_1395839728632046111_357084589 on 10.0.6.1:50010
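For context on what HDFS-744 adds: starting with 2.0.2-alpha the output stream exposes hsync(), which forces everything written so far onto the platters rather than just into the datanodes' page cache. A minimal sketch, assuming a Hadoop 2.x client (the path is a placeholder); as noted above, this API does not exist on the 1.x branch:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HsyncExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataOutputStream out = fs.create(new Path("/tmp/hsync-demo"));
    out.write("some edits".getBytes("UTF-8"));
    out.hflush();  // visible to other readers, but possibly only in OS buffers
    out.hsync();   // HDFS-744: ask the datanodes to fsync the block file to disk
    out.close();
  }
}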
Re: stop_replication dangerous?
Sure thing: https://issues.apache.org/jira/browse/HBASE-8844 On Mon, Jul 1, 2013 at 3:59 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Yeah that package documentation ought to be changed. Mind opening a jira? Thx, J-D
Re: data loss after cluster wide power loss
Thanks for the response, Suresh. I'm not sure that I understand the details properly. From my reading of HDFS-744 the hsync API would allow a client to make sure that at any point in time its writes so far have hit the disk. For example, for HBase it could apply an fsync after adding some edits to its WAL to ensure those edits are fully durable for a file which is still open. However, in this case the dfs file was closed and even renamed. Is it the case that even after a dfs file is closed and renamed, the data blocks would still not be synced and would still be stored by the datanode in blocksBeingWritten rather than in current? If that is the case, would it be better for the NameNode not to reject replicas that are in blocksBeingWritten, especially if it doesn't have any other replicas available? Dave On Mon, Jul 1, 2013 at 3:16 PM, Suresh Srinivas sur...@hortonworks.com wrote: Yes this is a known issue. The HDFS part of this was addressed in https://issues.apache.org/jira/browse/HDFS-744 for 2.0.2-alpha and is not available in the 1.x releases. I think HBase does not use this API yet.
Re: how can i improve sequence write speed?
Hello there, I'm sorry I didn't quite get it. What do you mean by sequence write speed? If you are looking for ways to improve HBase writes, you might find this useful: http://hbase.apache.org/book/perf.writing.html Warm Regards, Tariq cloudfront.blogspot.com On Mon, Jul 1, 2013 at 9:44 AM, ch huang justlo...@gmail.com wrote: I deployed an HBase cluster for use in a production environment; how can I improve sequence write speed? Thanks all
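To make the perf.writing pointer concrete, the usual client-side levers in the 0.94 API are deferred autoflush, a larger write buffer, and batched puts. A minimal sketch; the table name, payload, and buffer sizes are placeholders to tune for your setup:

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedWriter {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");     // placeholder table name
    table.setAutoFlush(false);                      // buffer puts on the client
    table.setWriteBufferSize(8 * 1024 * 1024);      // flush roughly every 8 MB
    List<Put> batch = new ArrayList<Put>();
    for (int i = 0; i < 100000; i++) {
      Put p = new Put(Bytes.toBytes(String.format("row-%08d", i)));
      p.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("value-" + i));
      batch.add(p);
      if (batch.size() == 1000) {                   // hand over puts in chunks of 1000
        table.put(batch);
        batch.clear();
      }
    }
    table.put(batch);
    table.flushCommits();                           // push anything still buffered
    table.close();
  }
}

Sequential (monotonically increasing) row keys also tend to hammer one region at a time, so pre-splitting the table or salting the key prefix usually matters as much as the client-side batching.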
Re: data loss after cluster wide power loss
HBase is interesting here, because it rewrites old data into new files. So a power outage by default would not just lose new data but potentially old data as well. You can enable sync on block close in HDFS, and then at least be sure that closed blocks (and thus files) are synced to disk physically. I found that if that is paired with the sync-behind-writes fadvise hint, the performance impact is minimal. -- Lars
Re: data loss after cluster wide power loss
How to enable sync on block close in HDFS? --Send from my Sony mobile. On Jul 2, 2013 6:47 AM, Lars Hofhansl lhofha...@yahoo.com wrote: HBase is interesting here, because it rewrites old data into new files. So a power outage by default would not just lose new data but potentially old data as well. You can enable sync on block close in HDFS, and then at least be sure that closed blocks (and thus files) are synced to disk physically. I found that if that is paired with the sync-behind-writes fadvise hint, the performance impact is minimal. -- Lars
Re: data loss after cluster wide power loss
On Mon, Jul 1, 2013 at 4:52 PM, Azuryy Yu azury...@gmail.com wrote: how to enable sync on block close in HDFS? Set dfs.datanode.synconclose to true. See https://issues.apache.org/jira/browse/HDFS-1539
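For completeness, that is a datanode-side setting, so it goes into hdfs-site.xml on the datanodes (followed by a datanode restart); a minimal snippet:

<property>
  <name>dfs.datanode.synconclose</name>
  <value>true</value>
</property>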
Re: Behavior of Filter.transform() in FilterList?
On Mon, Jul 1, 2013 at 12:01 PM, lars hofhansl la...@apache.org wrote: It would make sense, but it is not immediately clear how to do so cleanly. We would no longer be able to call transform at the StoreScanner level (or evaluate the filter multiple times, or require the filters to maintain their - last - state and only apply transform selectively). I believe this change can be implemented directly in FilterList, without requiring other changes. A FilterList could compute its transformed KeyValue while applying filterKeyValue() on the filters it contains, and return the pre-computed transformed KeyValue in FilterList.transform() if it makes sense to do so. This means Filter.transform() is always applied immediately after a filterKeyValue() with a return code that includes the KeyValue, and this would be true for all filters in the hierarchy. C.
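A rough illustration of the idea Christophe describes, written against the 0.94-style KeyValue filter API (this is a sketch, not the actual FilterList or a patch, and it only covers the AND case with the INCLUDE verdict): filterKeyValue() transforms the cell as it walks the children that accept it, caches the result, and transform() simply returns that cache.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterBase;

public class TransformAwareFilterList extends FilterBase {
  private final List<Filter> filters;   // children, combined with AND semantics here
  private KeyValue transformed;         // cached transform result for the current cell

  public TransformAwareFilterList(List<Filter> filters) {
    this.filters = filters;
  }

  @Override
  public ReturnCode filterKeyValue(KeyValue kv) {
    KeyValue current = kv;
    for (Filter f : filters) {
      ReturnCode rc = f.filterKeyValue(current);
      if (rc != ReturnCode.INCLUDE) {
        transformed = kv;               // cell rejected somewhere: leave it untouched
        return rc;
      }
      current = f.transform(current);   // only filters that accepted the cell get to transform it
    }
    transformed = current;
    return ReturnCode.INCLUDE;
  }

  @Override
  public KeyValue transform(KeyValue kv) {
    return transformed;                 // hand back what filterKeyValue() computed
  }

  public void write(DataOutput out) throws IOException { /* serialization omitted in this sketch */ }

  public void readFields(DataInput in) throws IOException { /* serialization omitted in this sketch */ }
}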
Re: Behavior of Filter.transform() in FilterList?
Christophe: Looks like you have a clear idea of what to do. If you can show us in the form of a patch, that would be nice. Cheers