[ 
https://issues.apache.org/jira/browse/HBASE-8691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677639#comment-13677639
 ] 

Enis Soztutar commented on HBASE-8691:
--------------------------------------

This looks very promising from the POC. 
It seems that we can achieve this with purely client-side changes. We can keep 
a buffer of scan results that a background thread continuously tries to fill 
once the scanner is opened, while the application processes the results. 
However, this will probably still not match the performance of pure streaming 
from the RS. 
Alternatively, for each scanner, we can open a streaming thread in the RS (like 
the DataStreamer in HDFS) to pump data to the socket until the socket buffer is 
full, as in this patch. Yet another thing we can try is to keep the current 
scan semantics, but have each next() trigger a prefetch into the block cache. 
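
To make the client-side buffering idea concrete, here is a minimal sketch of a 
prefetching wrapper. It is not taken from the attached patch and is not part of 
the HBase API: the class name PrefetchingIterator, the capacity parameter, and 
the use of a plain java.util.Iterator as a stand-in for a scanner are all 
assumptions made for illustration. A background thread fills a bounded queue 
while the caller drains it.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Hypothetical client-side prefetcher (illustration only, not an HBase class).
 * Wraps any Iterator and lets a background thread read ahead into a bounded
 * buffer. The source must not yield null elements.
 */
final class PrefetchingIterator<T> implements Iterator<T>, AutoCloseable {
    // Sentinel placed in the buffer when the source is exhausted or has failed.
    private static final Object END = new Object();

    private final BlockingQueue<Object> buffer;
    private final Thread filler;
    private volatile Throwable failure;
    private Object next; // null means nothing has been taken from the buffer yet

    PrefetchingIterator(Iterator<T> source, int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
        this.filler = new Thread(() -> {
            try {
                while (source.hasNext()) {
                    buffer.put(source.next()); // blocks while the buffer is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // consumer called close()
                return;
            } catch (RuntimeException e) {
                failure = e; // reported to the consumer when it reaches END
            }
            try {
                buffer.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "scan-prefetcher");
        filler.setDaemon(true);
        filler.start();
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            try {
                next = buffer.take(); // waits until the filler has produced something
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for results", e);
            }
        }
        if (next == END) {
            if (failure != null) {
                throw new IllegalStateException("prefetch failed", failure);
            }
            return false;
        }
        return true;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = (T) next;
        next = null;
        return result;
    }

    @Override
    public void close() {
        filler.interrupt(); // stop reading ahead
        next = END;         // later hasNext() calls return false instead of blocking
    }
}
```

The bounded queue is what keeps the read-ahead from running arbitrarily far in 
front of the application: once the buffer is full, the filler thread blocks 
until the consumer takes a result. In a real client the wrapped source would be 
the scanner's result iterator, and the fetch would still be a sequence of 
request/response round trips, which is why this approach is unlikely to match 
true streaming from the RS. 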
                
> High-Throughput Streaming Scan API
> ----------------------------------
>
>                 Key: HBASE-8691
>                 URL: https://issues.apache.org/jira/browse/HBASE-8691
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners
>    Affects Versions: 0.95.0
>            Reporter: Sandy Pratt
>              Labels: perfomance, scan
>         Attachments: HRegionServlet.java, README.txt, RecordReceiver.java, 
> ScannerTest.java, StreamHRegionServer.java, StreamReceiverDirect.java, 
> StreamServletDirect.java
>
>
> I've done some work testing various ways to refactor and optimize Scans in 
> HBase, and have found that performance can be dramatically increased by the 
> addition of a streaming scan API.  The attached code constitutes a proof of 
> concept that shows performance increases of almost 4x in some workloads.
> I'd appreciate testing, replication, and comments.  If the approach seems 
> viable, I think such an API should be built into some future version of HBase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
