[ 
https://issues.apache.org/jira/browse/HBASE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214133#comment-14214133
 ] 

stack commented on HBASE-12411:
-------------------------------

Is that all it takes?

What about your trick to flip to pread if we are seek+reading already? That
will still work, right? Since compactions have their own file, they'll seek+read?

So, seek + read makes (slight) sense (10 or 20% better throughput?) for a long
scan when only one scan is going on at a time. Otherwise, with lots of small
gets and scans, pread makes more sense.

seek + read blocks out preads while it's running, until HDFS-6735 is fixed.
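
To make the two read styles concrete, here is a minimal sketch. It uses
java.nio's FileChannel as a local stand-in for an HDFS input stream (an
assumption made only so the example runs without a cluster; the class and
method names are hypothetical and this is not the HBase or HDFS code path).
seekRead moves the shared stream position and so has to be serialized, while
pread takes an explicit offset and leaves the position alone.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadStyles {
    // Seek + read: changes the channel's shared position, so callers that
    // share one channel must take turns (modelled here by the class lock).
    static synchronized byte[] seekRead(FileChannel ch, long pos, int len) throws IOException {
        ch.position(pos);
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining() && ch.read(buf) >= 0) {
            // keep reading until the buffer is full or EOF is reached
        }
        return toArray(buf);
    }

    // Positional read (pread): the offset is passed explicitly and the
    // channel's own position is not modified, so no lock is needed here.
    static byte[] pread(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        long p = pos;
        while (buf.hasRemaining()) {
            int n = ch.read(buf, p);
            if (n < 0) {
                break; // EOF
            }
            p += n;
        }
        return toArray(buf);
    }

    private static byte[] toArray(ByteBuffer buf) {
        buf.flip();
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("readstyles", ".bin");
        Files.write(f, "0123456789".getBytes(StandardCharsets.US_ASCII));
        try (FileChannel ch = FileChannel.open(f, StandardOpenOption.READ)) {
            String a = new String(pread(ch, 3, 4), StandardCharsets.US_ASCII);
            System.out.println(a + " position=" + ch.position());
            String b = new String(seekRead(ch, 3, 4), StandardCharsets.US_ASCII);
            System.out.println(b + " position=" + ch.position());
        } finally {
            Files.deleteIfExists(f);
        }
    }
}
```

Both calls return the same bytes; the difference is that only seekRead moves
the stream position, which is why it cannot safely overlap with other readers
of the same stream.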

Compactions are long reads. It makes sense to do seek + read for these. Giving
them their own file means they won't interfere with ongoing gets/scans. I
suppose we'll open a lot of files with the NN when doing a compaction. That
could take a while if there are a bunch of files to open. Do we open in
parallel? So, this could make compactions take a bit longer... but compactions
are a background task, so OK?

Add a toggle so it's easy to flip it on, and then let's try and get some numbers?
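
A rough sketch of what such a toggle could look like, assuming a plain
java.util.Properties object in place of the real configuration. The property
key, class name, and enum are all hypothetical and only illustrate the
decision; they are not taken from the attached patch.

```java
import java.util.Properties;

public class ReadModeToggle {
    enum ReadMode { PREAD, SEEK_READ }

    // Hypothetical key for illustration; not an actual HBase setting.
    static final String ALWAYS_PREAD_KEY = "example.storefile.always.pread";

    // Picks the read style for a scanner.
    // longScan: true for long sequential reads such as compactions.
    static ReadMode choose(Properties conf, boolean longScan) {
        boolean alwaysPread =
            Boolean.parseBoolean(conf.getProperty(ALWAYS_PREAD_KEY, "false"));
        if (alwaysPread) {
            return ReadMode.PREAD; // toggle on: never seek + read
        }
        // Toggle off: seek + read for long scans, pread for short gets/scans.
        return longScan ? ReadMode.SEEK_READ : ReadMode.PREAD;
    }
}
```

With the flag off, a long scan still gets seek + read; flipping it on forces
pread everywhere, which makes it easy to compare the two setups.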

Good stuff [~lhofhansl]

> Avoid seek + read completely?
> -----------------------------
>
>                 Key: HBASE-12411
>                 URL: https://issues.apache.org/jira/browse/HBASE-12411
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: Performance
>            Reporter: Lars Hofhansl
>         Attachments: 12411.txt
>
>
> In the light of HDFS-6735 we might want to consider refraining from seek + 
> read completely and only perform preads.
> For example, a compaction can currently lock out every other scanner over
> the file that the compaction is reading.
> At the very least we can introduce an option to avoid seek + read, so we can 
> allow testing this in various scenarios.
> This will definitely be of great importance for projects like Phoenix which
> parallelize queries intra region (and hence readers will be used concurrently
> by multiple scanners with high likelihood.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
