Ma,

A number of things come up.
First, you say that your application is read-only - I'm not sure how that 
translates to "sequential read or scan".

Second, you said that you have the following setup:
- 8 AWS m1.xlarge instances, each with 4 CPU cores and 16 GB RAM. Each region 
server is configured with a 10 GB heap. The test HTable has 23 regions, one 
hfile per region (just major compacted).
- DATA_BLOCK_ENCODING = FAST_DIFF
- The RegionScan runs in an endpoint coprocessor so that no network overhead is 
included. Average key length is 35 and average value length is 5.
- scan.setCaching(1024), scan.setMaxResultSize(5M)
- Also tried the PREFETCH_BLOCKS_ON_OPEN for the whole table, however no 
improvement was observed.
- The CPU seems to be fully utilized. Maybe the process of decoding FAST_DIFF 
rows is too heavy for the CPU?
- There's no other resource contention when I ran the tests.
- From the hfile dump on one of your regions (each region has one hfile): 
entries=160988965, length=1832309188

Given this setup, some questions:
- With the above (virtualized) h/w setup, what is your target scan rate?
- In one context you say that there is no resource contention, yet in another 
you say the CPU seems to be fully utilized. Those seem to conflict.
- With such a unique data design (tall-skinny table), how is your column named 
(how many characters)? Please also share a "describe" output of the table.
- Does the "scan logic" need to be in a coprocessor? If so why?
- How large is your data? What is the approximate fraction of the table data 
across each server? 
  From the single hfile sample output, each region holds about 1.8 GB. With 23 
regions, would it be right to say the table is roughly 23 x 1.8 = ~41 GB, i.e. 
about 5 GB per server?
- What is your block cache size?
- Have you tried consecutive runs - to ensure that data is indeed served 
completely from cache? Can you verify/check that from the regionserver stats?

To help you better understand the benefits or impacts of the tuning you have 
done, I would say do a baseline test and then try out the "tuning changes" one 
at a time.

The first baseline test should measure how fast your system can read data at 
the plain server level.
So e.g. I would do a "time tar cf - <path-of-hdfs-data-on-server> | cat > /dev/null".
(Piping through cat matters: GNU tar detects /dev/null as the archive and 
skips actually reading the file contents.)
This will give you the upper limit on how fast your system can read the data 
(no caching involved).
I would do this a couple of times - first to get the time it takes with disk 
I/O, and then again to get the time without disk I/O, once the data is 
completely cached.
You might have to shut down the HBase regionserver for this test, since its 
10 GB heap leaves little of the 16 GB RAM for the OS filesystem cache, to 
ensure the file is read completely from the OS filesystem cache.
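A minimal sketch of that cold/warm read baseline (DATA_DIR is a placeholder - 
point it at the local DataNode directory that actually holds the HBase files; 
the cache-drop step is only possible as root):

```shell
# Placeholder: point this at the local DataNode directory holding the HBase data.
DATA_DIR="${DATA_DIR:-/tmp}"

# Cold run: drop the OS page cache first (needs root), then time a full read.
if [ "$(id -u)" -eq 0 ]; then
    sync && echo 3 > /proc/sys/vm/drop_caches
fi
# Pipe through cat: GNU tar skips reading file contents when the archive
# is /dev/null itself, which would make the timing meaningless.
time tar cf - "$DATA_DIR" 2>/dev/null | cat > /dev/null

# Warm run: the same read again, now (mostly) served from the page cache.
time tar cf - "$DATA_DIR" 2>/dev/null | cat > /dev/null
```

The gap between the two timings tells you how much of your scan cost is disk 
I/O versus pure read/CPU work.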

For the second baseline, with no other tuning in place, I would run the 
RowCounter utility.
I would run that with a number of tuning options, each tried on its own to see 
its impact:
- a small hbase regionserver heap size (say 2 GB), allowing more RAM for the 
OS filesystem cache
- larger hbase regionserver heap sizes (say 4, 6, 8, 10 GB)
- scanner caching values of 1, 10, 100, 1000, 10000
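A sketch of those RowCounter runs (table name taken from the hfile dump below; 
assuming the -D override is picked up, as RowCounter is a standard MapReduce 
tool):

```shell
# Baseline: count rows with default client settings.
hbase org.apache.hadoop.hbase.mapreduce.RowCounter KYLIN_YMSGYYXO12

# Same run with scanner caching overridden for this invocation only,
# so each caching value can be tried without any code change.
hbase org.apache.hadoop.hbase.mapreduce.RowCounter \
    -Dhbase.client.scanner.caching=1000 KYLIN_YMSGYYXO12
```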

If possible, I would also try this without any block encoding and compression.
Then I would try out the different block encoding and caching options, all 
individually.
And finally I would try a combination of the options that seemed best - 
combining, say, 2 at a time first and then more (or as you see fit).
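To flip the encoding between runs, something like the following inside 
`hbase shell` should work (table and family names taken from the hfile dump 
below; the major compaction rewrites the hfiles so the new encoding actually 
takes effect on disk):

```ruby
alter 'KYLIN_YMSGYYXO12', {NAME => 'F1', DATA_BLOCK_ENCODING => 'NONE'}
major_compact 'KYLIN_YMSGYYXO12'
```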

Using RowCounter helps you to focus on the tuning parameters and not do any 
code changes.

Hope that helps....

Jayesh



From: Vladimir Rodionov [mailto:[email protected]] 
Sent: Thursday, April 21, 2016 11:11 PM
To: [email protected]
Cc: Thakrar, Jayesh <[email protected]>
Subject: Re: Rows per second for RegionScanner

Try disabling block encoding - you will get better numbers.

>>  I mean per region scan speed,

Scan performance depends on the number of CPU cores - the more cores you have, 
the more performance you will get. Your servers are pretty low end (4 virtual 
CPU cores is just 2 hardware cores). With 32 cores per node you would get 
close to an 8x speedup. 

-Vlad


On Thu, Apr 21, 2016 at 7:22 PM, hongbin ma <[email protected]> wrote:
hi Thakrar

Thanks for your reply.

My settings for the RegionScanner Scan is

scan.setCaching(1024)
scan.setMaxResultSize(5M)

Even if I change the caching to 100000, I'm still not getting any
improvement. I guess the caching works for remote scans through RPC,
but doesn't help much for region-side scans?

I also tried the PREFETCH_BLOCKS_ON_OPEN for the whole table, however no
improvement was observed.

I'm pursuing pure scan-read performance optimization because our
application is sort of read-only. And I observed that even if I did
nothing else (only scanning) in my coprocessor, the scan speed is not
satisfying. The CPU seems to be fully utilized. Maybe the process of
decoding FAST_DIFF rows is too heavy for the CPU? How many rows/second scan
speed would you expect on a normal setup? I mean per-region scan speed,
not the overall scan speed counting all regions.

thanks

On Thu, Apr 21, 2016 at 10:24 PM, Thakrar, Jayesh <
[email protected]> wrote:

> Just curious - have you set the scanner caching to some high value - say
> 1000 (or even higher in your small value case)?
>
> The parameter is hbase.client.scanner.caching
>
> You can read up on it - https://hbase.apache.org/book.html
>
> Another thing, are you just looking for pure scan-read performance
> optimization?
> Depending upon the table size you can also look into caching the table or
> not caching at all.
>
> -----Original Message-----
> From: hongbin ma [mailto:[email protected]]
> Sent: Thursday, April 21, 2016 5:04 AM
> To: [email protected]
> Subject: Rows per second for RegionScanner
>
> Hi, experts,
>
> I'm trying to figure out how fast HBase can scan. I'm setting up the
> RegionScan in an endpoint coprocessor so that no network overhead will be
> included. My average key length is 35 and average value length is 5.
>
> My test result is that if I warm all my interested blocks in the block
> cache, I'm only able to scan around 300,000 rows per second per region
> (with an endpoint I guess it's one thread per region), so it's like getting
> 15 MB of data per second. I'm not sure if this is already an acceptable
> number for HBase. The answers from you experts might help me decide if
> it's worth further digging into tuning it.
>
> thanks!
>
>
>
>
>
>
> other info:
>
> My hbase cluster is on 8 AWS m1.xlarge instance, with 4 CPU cores and 16G
> RAM. Each region server is configured 10G heap size. The test HTable has 23
> regions, one hfile per region (just major compacted). There's no other
> resource contention when I ran the tests.
>
> Attached is the HFile output of one of the region hfile:
> =============================================
>  hbase  org.apache.hadoop.hbase.io.hfile.HFile -m -s -v -f
>
> /apps/hbase/data/data/default/KYLIN_YMSGYYXO12/d42b9faf43eafcc9640aa256143d5be3/F1/30b8a8ff5a82458481846e364974bf06
> 2016-04-21 09:16:04,091 INFO  [main] Configuration.deprecation:
> hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> 2016-04-21 09:16:04,292 INFO  [main] util.ChecksumType: Checksum using
> org.apache.hadoop.util.PureJavaCrc32
> 2016-04-21 09:16:04,294 INFO  [main] util.ChecksumType: Checksum can use
> org.apache.hadoop.util.PureJavaCrc32C
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
>
> [jar:file:/usr/hdp/2.2.9.0-3393/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
>
> [jar:file:/usr/hdp/2.2.9.0-3393/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> 2016-04-21 09:16:05,654 INFO  [main] Configuration.deprecation:
> fs.default.name is deprecated. Instead, use fs.defaultFS Scanning ->
>
> /apps/hbase/data/data/default/KYLIN_YMSGYYXO12/d42b9faf43eafcc9640aa256143d5be3/F1/30b8a8ff5a82458481846e364974bf06
> Block index size as per heapsize: 3640
>
> reader=/apps/hbase/data/data/default/KYLIN_YMSGYYXO12/d42b9faf43eafcc9640aa256143d5be3/F1/30b8a8ff5a82458481846e364974bf06,
>     compression=none,
>     cacheConf=CacheConfig:disabled,
>
>
> firstKey=\x00\x0B\x00\x00\x00\x00\x00\x00\x00\x09\x00\x00\x00\x00\x00\x01\xF4/F1:M/0/Put,
>
>
> lastKey=\x00\x0B\x00\x00\x00\x00\x00\x00\x00\x1F\x06-?\x0F"U\x00\x00\x03[^\xD9/F1:M/0/Put,
>     avgKeyLen=35,
>     avgValueLen=5,
>     entries=160988965,
>     length=1832309188
> Trailer:
>     fileinfoOffset=1832308623,
>     loadOnOpenDataOffset=1832306641,
>     dataIndexCount=43,
>     metaIndexCount=0,
>     totalUncomressedBytes=1831809883,
>     entryCount=160988965,
>     compressionCodec=NONE,
>     uncompressedDataIndexSize=5558733,
>     numDataIndexLevels=2,
>     firstDataBlockOffset=0,
>     lastDataBlockOffset=1832250057,
>     comparatorClassName=org.apache.hadoop.hbase.KeyValue$KeyComparator,
>     majorVersion=2,
>     minorVersion=3
> Fileinfo:
>     DATA_BLOCK_ENCODING = FAST_DIFF
>     DELETE_FAMILY_COUNT = \x00\x00\x00\x00\x00\x00\x00\x00
>     EARLIEST_PUT_TS = \x00\x00\x00\x00\x00\x00\x00\x00
>     MAJOR_COMPACTION_KEY = \xFF
>     MAX_SEQ_ID_KEY = 4
>     TIMERANGE = 0....0
>     hfile.AVG_KEY_LEN = 35
>     hfile.AVG_VALUE_LEN = 5
>     hfile.LASTKEY =
>
> \x00\x16\x00\x0B\x00\x00\x00\x00\x00\x00\x00\x1F\x06-?\x0F"U\x00\x00\x03[^\xD9\x02F1M\x00\x00\x00\x00\x00\x00\x00\x00\x04
> Mid-key:
>
> \x00\x12\x00\x0B\x00\x00\x00\x00\x00\x00\x00\x1D\x04_\x07\x89\x00\x00\x02l\x00\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00\x00\x00\x007|\xBE$\x00\x00;\x81
> Bloom filter:
>     Not present
> Delete Family Bloom filter:
>     Not present
> Stats:
>    Key length:
>                min = 32.00
>                max = 37.00
>               mean = 35.11
>             stddev = 1.46
>             median = 35.00
>               75% <= 37.00
>               95% <= 37.00
>               98% <= 37.00
>               99% <= 37.00
>             99.9% <= 37.00
>              count = 160988965
>    Row size (bytes):
>                min = 44.00
>                max = 55.00
>               mean = 48.17
>             stddev = 1.43
>             median = 48.00
>               75% <= 50.00
>               95% <= 50.00
>               98% <= 50.00
>               99% <= 50.00
>             99.9% <= 51.97
>              count = 160988965
>    Row size (columns):
>                min = 1.00
>                max = 1.00
>               mean = 1.00
>             stddev = 0.00
>             median = 1.00
>               75% <= 1.00
>               95% <= 1.00
>               98% <= 1.00
>               99% <= 1.00
>             99.9% <= 1.00
>              count = 160988965
>    Val length:
>                min = 4.00
>                max = 12.00
>               mean = 5.06
>             stddev = 0.33
>             median = 5.00
>               75% <= 5.00
>               95% <= 5.00
>               98% <= 6.00
>               99% <= 8.00
>             99.9% <= 9.00
>              count = 160988965
> Key of biggest row:
>
> \x00\x0B\x00\x00\x00\x00\x00\x00\x00\x1F\x04\xDD:\x06\x00U\x00\x00\x00\x8DS\xD2
> Scanned kv count -> 160988965
>
>
>
>
> This email and any files included with it may contain privileged,
> proprietary and/or confidential information that is for the sole use
> of the intended recipient(s).  Any disclosure, copying, distribution,
> posting, or use of the information contained in or attached to this
> email is prohibited unless permitted by the sender.  If you have
> received this email in error, please immediately notify the sender
> via return email, telephone, or fax and destroy this original transmission
> and its included files without reading or saving it in any manner.
> Thank you.
>


--
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone





