[jira] [Commented] (HDFS-14809) Reduce BlockReaderLocal RPC calls

2020-03-15 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059627#comment-17059627
 ] 

Lisheng Sun commented on HDFS-14809:


[~ken_1...@163.com]
{quote}
 As far as I know, the shm and slot are used to record the reference count by the client, 
and read by the DN to determine whether a replica can be uncached by CacheManager.
{quote}
The shm and slot also serve to keep the validity of the replica synchronized 
between the client and the DN.
There are two questions:
1. How does the client know whether the replica is still valid in your 
implementation?
2. Many DFSInputStreams of one DFSClient may do short-circuit reads on the same 
replica, so an RPC is needed for each DFSInputStream every time.
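
To make the two questions concrete, here is a minimal toy sketch of the concern. It is not Hadoop code: Slot, CachedReplica and canReuseWithoutRpc are hypothetical names invented for illustration, and the fds are modelled as a plain string. It assumes that the shared slot is the only signal the client has about replica validity, which is the point being asked about, not a confirmed property of the patch.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy model only; none of these names are Hadoop classes. */
class SlotValiditySketch {

    /** Hypothetical stand-in for a slot in shared memory that both sides can see. */
    static final class Slot {
        private final AtomicBoolean valid = new AtomicBoolean(true);

        boolean isValid() { return valid.get(); }

        /** Called by the (modelled) DataNode when the replica is no longer valid. */
        void invalidate() { valid.set(false); }
    }

    /** Client-side cache entry; the fds are modelled as a plain string. */
    static final class CachedReplica {
        private final String fds;
        private final Slot slot; // null models the fd-only variant (assumption)

        CachedReplica(String fds, Slot slot) {
            this.fds = fds;
            this.slot = slot;
        }

        String fds() { return fds; }

        /** True only if a shared slot exists and still reports the replica as valid. */
        boolean canReuseWithoutRpc() {
            return slot != null && slot.isValid();
        }
    }

    public static void main(String[] args) {
        Slot slot = new Slot();
        CachedReplica withSlot = new CachedReplica("fds-for-block-1", slot);
        CachedReplica fdOnly = new CachedReplica("fds-for-block-1", null);

        System.out.println(withSlot.canReuseWithoutRpc());
        slot.invalidate();
        System.out.println(withSlot.canReuseWithoutRpc());
        System.out.println(fdOnly.canReuseWithoutRpc());
    }
}
```

In this model a client without a slot can never prove that its cached fds are still good, so every stream would have to go back to the DN.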

> Reduce BlockReaderLocal RPC calls
> -
>
> Key: HDFS-14809
> URL: https://issues.apache.org/jira/browse/HDFS-14809
> Project: Hadoop HDFS
> Issue Type: New Feature
> Affects Versions: 2.6.0
> Reporter: KenCao
> Assignee: KenCao
> Priority: Major
> Attachments: HADOOP-14809
>
>
> As we know, the HDFS client Java library uses BlockReaderLocal for short-circuit 
> reads by default. It first allocates shared memory and creates a slot within 
> it, and only then requests the fds from the DataNode. 
> However, the slot and shared memory structure are only used by the DataNode when 
> uncaching replicas. The client process can work well with just the fds it 
> requests afterwards, and it is nearly impossible to cache replicas in a 
> production environment. 
> The API to release fds is called by the client with only the slot given; the fds 
> are ultimately closed in the client process.  
> So I think we can make a new BlockReader implementation that requests only 
> the fds, which will reduce the RPC calls from 3 (allocate shm, request fds, 
> release fds) to 1 (request fds).
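
The 3-to-1 arithmetic in the description can be sketched as a toy model. This is only an illustration under stated assumptions: FakeDataNode, currentFlow and proposedFlow are hypothetical names, the request labels are copied from the description, and nothing here calls real Hadoop APIs.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: records which requests a client would send to the DataNode. */
class RpcCountSketch {

    /** Hypothetical stand-in for the DataNode side; it just logs request names. */
    static final class FakeDataNode {
        final List<String> requests = new ArrayList<>();

        void call(String name) { requests.add(name); }
    }

    /** Flow as described for BlockReaderLocal: allocate shm, request fds, release fds. */
    static List<String> currentFlow(FakeDataNode dn) {
        dn.call("allocate shm");
        // The slot is created inside the shm; no request is modelled for it here.
        dn.call("request fds");
        dn.call("release fds");
        return dn.requests;
    }

    /** Proposed flow: request the fds only; they are closed locally by the client. */
    static List<String> proposedFlow(FakeDataNode dn) {
        dn.call("request fds");
        return dn.requests;
    }

    public static void main(String[] args) {
        System.out.println(currentFlow(new FakeDataNode()).size());
        System.out.println(proposedFlow(new FakeDataNode()).size());
    }
}
```

The model counts requests for a single read and says nothing about how often fds can be reused across reads.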



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14809) Reduce BlockReaderLocal RPC calls

2020-03-12 Thread KenCao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17058013#comment-17058013
 ] 

KenCao commented on HDFS-14809:
---

Hello [~leosun08]. I think, in theory, it is possible to just request the fds 
from the DN and execute the read logic. In my implementation, the client will 
fall back to the other readers if it fails to get a BlockReaderLocal2, and those 
will work as before. As far as I know, the shm and slot are used to record the 
reference count by the client, and read by the DN to determine whether a replica 
can be uncached by CacheManager. However, the CacheManager is never used in my 
situation, and Alluxio may be a better choice. 
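
The fallback behaviour mentioned above can be sketched roughly as follows. This is a hypothetical illustration, not the attached patch: readers are modelled as plain strings, "existing-reader" stands for whichever reader the client would have used before, and firstAvailable and pick are invented names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

/** Toy model only: readers are represented by their names. */
class ReaderFallbackSketch {

    /** Returns the first reader that a factory manages to create. */
    static String firstAvailable(List<Supplier<Optional<String>>> factories) {
        for (Supplier<Optional<String>> factory : factories) {
            Optional<String> reader = factory.get();
            if (reader.isPresent()) {
                return reader.get();
            }
        }
        throw new IllegalStateException("no block reader available");
    }

    /** Tries the fd-only reader first, then falls back to the existing path. */
    static String pick(boolean fdOnlyReaderAvailable) {
        Supplier<Optional<String>> fdOnly = () -> fdOnlyReaderAvailable
                ? Optional.of("BlockReaderLocal2")
                : Optional.empty();
        Supplier<Optional<String>> existing = () -> Optional.of("existing-reader");
        return firstAvailable(Arrays.asList(fdOnly, existing));
    }

    public static void main(String[] args) {
        System.out.println(pick(true));
        System.out.println(pick(false));
    }
}
```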







[jira] [Commented] (HDFS-14809) Reduce BlockReaderLocal RPC calls

2020-03-07 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17054313#comment-17054313
 ] 

Lisheng Sun commented on HDFS-14809:


Thanks [~ken_1...@163.com] for reporting this jira. 

I don't quite understand some parts of this jira's description.
{quote}
 However, the slot and shared memory structure is only used by DataNode when 
uncaching replicas,
{quote}
 Do you mean that the client doesn't use the slot and shared memory structure 
when replicas are uncached? 
How does the client know whether the replica is invalid at that time?  
The slot's purpose is to synchronize replica validity and the replica reference 
count between the client and the DN.
As I understand it, your implementation requests only the fds, so it needs an 
RPC to request the fds before each short-circuit read.
Please correct me if I am wrong.
Regarding the performance of short-circuit reads, you may want to look at 
HDFS-13639 and HDFS-13564.
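
As a rough illustration of the reference-count role of the slot mentioned above, here is a toy sketch. It assumes a simple counter shared by both sides; Slot, addRef, removeRef and canUncache are hypothetical names and do not describe the real shared-memory layout.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model only; none of these names are Hadoop classes. */
class SlotRefCountSketch {

    /** Hypothetical slot: the client writes the count, the DN only reads it. */
    static final class Slot {
        private final AtomicInteger refs = new AtomicInteger();

        /** Client side: a reader starts using the replica. */
        void addRef() { refs.incrementAndGet(); }

        /** Client side: a reader is done with the replica. */
        void removeRef() { refs.decrementAndGet(); }

        /** DN side: the replica may be uncached only when no reader holds it. */
        boolean canUncache() { return refs.get() == 0; }
    }

    public static void main(String[] args) {
        Slot slot = new Slot();
        slot.addRef();
        System.out.println(slot.canUncache());
        slot.removeRef();
        System.out.println(slot.canUncache());
    }
}
```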







[jira] [Commented] (HDFS-14809) Reduce BlockReaderLocal RPC calls

2020-03-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050525#comment-17050525
 ] 

Hadoop QA commented on HDFS-14809:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  6s{color} | {color:blue} The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  9s{color} | {color:red} HDFS-14809 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-14809 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12979848/HADOOP-14809 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28890/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.









[jira] [Commented] (HDFS-14809) Reduce BlockReaderLocal RPC calls

2020-03-03 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050521#comment-17050521
 ] 

Wei-Chiu Chuang commented on HDFS-14809:


Updated the summary based on the suggestion.

I'm extremely sorry for missing this one. [~leosun08] [~openinx] does this 
make sense to you?



