[jira] [Commented] (HDFS-7694) FSDataInputStream should support "unbuffer"
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146220#comment-15146220 ] Chris Nauroth commented on HDFS-7694: - As per discussion on HADOOP-12805 with the HBase community, I'd like to make {{CanUnbuffer}} a {{Public}} interface. I'd appreciate if the original contributors here could help chime in on HADOOP-12805. Thanks! > FSDataInputStream should support "unbuffer" > --- > > Key: HDFS-7694 > URL: https://issues.apache.org/jira/browse/HDFS-7694 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.7.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 2.7.0, 2.6.4 > > Attachments: HDFS-7694.001.patch, HDFS-7694.002.patch, > HDFS-7694.003.patch, HDFS-7694.004.patch, HDFS-7694.005.patch > > > For applications that have many open HDFS (or other Hadoop filesystem) files, > it would be useful to have an API to clear readahead buffers and sockets. > This could be added to the existing APIs as an optional interface, in much > the same way as we added setReadahead / setDropBehind / etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7694) FSDataInputStream should support "unbuffer"
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15119738#comment-15119738 ] Junping Du commented on HDFS-7694: -- I have cherry-picked it to branch-2.6.
[jira] [Commented] (HDFS-7694) FSDataInputStream should support "unbuffer"
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117491#comment-15117491 ] Junping Du commented on HDFS-7694: -- Thanks [~cmccabe] for the confirmation. I will cherry-pick this patch to branch-2.6 once the build failure (caused by HADOOP-12715) is resolved.
[jira] [Commented] (HDFS-7694) FSDataInputStream should support "unbuffer"
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116622#comment-15116622 ] Colin Patrick McCabe commented on HDFS-7694: Hi, [~djp]. This change is compatible, since people are not expected to be subclassing {{FSDataInputStream}}. So it seems fine to backport to 2.6, if the maintainers of that branch think it will be useful there.
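For readers unfamiliar with the optional-interface pattern referred to in this thread, here is a minimal, self-contained sketch of why adding such a method does not break existing streams. Every type in it ({{OptionalUnbufferSketch}}, its nested {{CanUnbuffer}}, {{TrackingStream}}, {{DataStream}}) is a hypothetical stand-in written for illustration, not Hadoop source; in particular, throwing {{UnsupportedOperationException}} for a stream that lacks the capability is an assumption of this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

final class OptionalUnbufferSketch {
    /** Hypothetical stand-in for an optional capability interface. */
    interface CanUnbuffer {
        void unbuffer();
    }

    /** An inner stream that opts in to the capability. */
    static final class TrackingStream extends ByteArrayInputStream implements CanUnbuffer {
        boolean unbuffered = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void unbuffer() {
            unbuffered = true; // a real stream would drop buffers and sockets here
        }
    }

    /** Wrapper playing the role of the user-facing stream class. */
    static final class DataStream extends FilterInputStream {
        DataStream(InputStream in) {
            super(in);
        }

        /** Delegates only if the wrapped stream opted in (assumed behaviour). */
        void unbuffer() {
            if (in instanceof CanUnbuffer) {
                ((CanUnbuffer) in).unbuffer();
            } else {
                throw new UnsupportedOperationException(
                    in.getClass().getSimpleName() + " does not support unbuffer");
            }
        }
    }

    public static void main(String[] args) {
        TrackingStream inner = new TrackingStream(new byte[] {1, 2, 3});
        new DataStream(inner).unbuffer();
        System.out.println("unbuffered = " + inner.unbuffered);
    }
}
```

Streams that never implement the interface keep compiling unchanged; only callers that invoke the new method on such a stream see a difference.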
[jira] [Commented] (HDFS-7694) FSDataInputStream should support "unbuffer"
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114065#comment-15114065 ] Ted Yu commented on HDFS-7694: -- [~cmccabe] [~djp]: The patch applies cleanly on branch-2.6. Can you commit it to branch-2.6? This would benefit HBASE-9393. Thanks
[jira] [Commented] (HDFS-7694) FSDataInputStream should support "unbuffer"
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114100#comment-15114100 ] Junping Du commented on HDFS-7694: --
bq. Is there compatibility concern for backporting this to 2.6 branch?
I don't think so, after checking the patch and going by the description above: "This could be added to the existing APIs as an optional interface." [~cmccabe], what do you think?
[jira] [Commented] (HDFS-7694) FSDataInputStream should support "unbuffer"
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113588#comment-15113588 ] Ted Yu commented on HDFS-7694: -- Is there compatibility concern for backporting this to 2.6 branch? Thanks
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14323836#comment-14323836 ] Hudson commented on HDFS-7694: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #97 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/97/]) HDFS-7694. FSDataInputStream should support unbuffer (cmccabe) (cmccabe: rev 6b39ad0865cb2a7960dd59d68178f0bf28865ce2) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CanUnbuffer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/PeerCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestUnbuffer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319853#comment-14319853 ] Hudson commented on HDFS-7694: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #103 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/103/]) HDFS-7694. FSDataInputStream should support unbuffer (cmccabe) (cmccabe: rev 6b39ad0865cb2a7960dd59d68178f0bf28865ce2) * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/PeerCache.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestUnbuffer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CanUnbuffer.java
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319887#comment-14319887 ] Hudson commented on HDFS-7694: -- FAILURE: Integrated in Hadoop-Yarn-trunk #837 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/837/]) HDFS-7694. FSDataInputStream should support unbuffer (cmccabe) (cmccabe: rev 6b39ad0865cb2a7960dd59d68178f0bf28865ce2) * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/PeerCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CanUnbuffer.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestUnbuffer.java
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320225#comment-14320225 ] Hudson commented on HDFS-7694: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #104 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/104/]) HDFS-7694. FSDataInputStream should support unbuffer (cmccabe) (cmccabe: rev 6b39ad0865cb2a7960dd59d68178f0bf28865ce2) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CanUnbuffer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/PeerCache.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestUnbuffer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320130#comment-14320130 ] Hudson commented on HDFS-7694: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2035 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2035/]) HDFS-7694. FSDataInputStream should support unbuffer (cmccabe) (cmccabe: rev 6b39ad0865cb2a7960dd59d68178f0bf28865ce2) * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CanUnbuffer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestUnbuffer.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/PeerCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320270#comment-14320270 ] Hudson commented on HDFS-7694: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2054 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2054/]) HDFS-7694. FSDataInputStream should support unbuffer (cmccabe) (cmccabe: rev 6b39ad0865cb2a7960dd59d68178f0bf28865ce2) * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/PeerCache.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestUnbuffer.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CanUnbuffer.java
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14318742#comment-14318742 ] Hudson commented on HDFS-7694: -- FAILURE: Integrated in Hadoop-trunk-Commit #7091 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7091/]) HDFS-7694. FSDataInputStream should support unbuffer (cmccabe) (cmccabe: rev 6b39ad0865cb2a7960dd59d68178f0bf28865ce2) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/PeerCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestUnbuffer.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataInputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CanUnbuffer.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1431#comment-1431 ] Hadoop QA commented on HDFS-7694: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12698303/HDFS-7694.005.patch against trunk revision 8a54384. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9555//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9555//console This message is automatically generated.
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317561#comment-14317561 ] Yi Liu commented on HDFS-7694: -- +1 pending Jenkins. Thanks Colin.
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317571#comment-14317571 ] Uma Maheswara Rao G commented on HDFS-7694: --- Yes, this should be useful. +1
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315452#comment-14315452 ] Hadoop QA commented on HDFS-7694: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697877/HDFS-7694.004.patch against trunk revision e9d26fe. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9522//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9522//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/9522//artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9522//console This message is automatically generated. 
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315334#comment-14315334 ] Yi Liu commented on HDFS-7694: -- The latest patch looks good to me, thanks Colin. One comment: could we verify that reads still succeed, and that the content read is correct, after we call {{unbuffer}}?
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309780#comment-14309780 ]

Colin Patrick McCabe commented on HDFS-7694:

bq. CanUnbuffer ain't too pretty. Unbufferable is about as ugly. It's fine I suppose as is.

It's consistent with our other input stream extension interfaces such as {{Syncable}}, {{CanSetReadahead}}, etc. The problem is that we can't add the new APIs to {{FSInputStream}}, or else we'd break a bunch of non-HDFS streams (in and out of the tree) that don't implement the new API. I guess Java is adding default implementations for interface functions in some future version... too bad we're not there yet.

bq. In DFSIS#unbuffer, should we be resetting data members back to zero, etc?

I'm not sure what else we'd reset. This isn't changing the {{closed}} state, it's not a seek so the {{pos}} is not affected, it's not changing the {{cachingStrategy}} or {{fileEncryptionInfo}}... we certainly don't want to clear the block location info because then we need to do an RPC to the NN to get it again...

Actually I do see one thing we should change. We should set {{blockEnd}} to -1. Otherwise, {{seek}} may attempt to use {{blockReader}} even though it's {{null}}. It seems like this is also a problem in {{closeCurrentBlockReader}}. And let me add a {{seek}} after the unbuffer in {{testUnbufferClosesSockets}} to make sure that this doesn't regress.

bq. In testOpenManyFilesViaTcp, we assert we can read but is there a reason why we would not be able to that unbuffer enables? (pardon if dumb question)

Not a dumb question at all. What I was testing here was that opening a lot of files didn't consume too many resources. In my local test environment, I increased {{NUM_OPENS}} to be a really big number... I didn't want to burden Jenkins too much, though. {{testUnbufferClosesSockets}} is a more direct and straightforward test than {{testOpenManyFilesViaTcp}}... the latter is perhaps more of a stress test.

FSDataInputStream should support unbuffer
---
Key: HDFS-7694
URL: https://issues.apache.org/jira/browse/HDFS-7694
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Attachments: HDFS-7694.001.patch, HDFS-7694.002.patch

For applications that have many open HDFS (or other Hadoop filesystem) files, it would be useful to have an API to clear readahead buffers and sockets. This could be added to the existing APIs as an optional interface, in much the same way as we added setReadahead / setDropBehind / etc.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
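
The two technical points in Colin's reply above (an optional capability interface instead of a new method on the base stream class, and resetting {{blockEnd}} to -1 in unbuffer) can be illustrated with a minimal, self-contained sketch. This is not the Hadoop code: {{UnbufferCapable}}, {{SketchInputStream}}, {{UnbufferSketch}} and {{unbufferIfSupported}} are hypothetical names, and only the fields {{pos}}, {{blockEnd}} and {{blockReader}} are borrowed from the comment. A {{ByteArrayInputStream}} stands in for the socket and readahead buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

/** Optional capability, modeled on the CanUnbuffer idea; the name is hypothetical. */
interface UnbufferCapable {
  void unbuffer();
}

/** Hypothetical stand-in for DFSInputStream; not the real implementation. */
class SketchInputStream extends InputStream implements UnbufferCapable {
  private final byte[] data;
  private final int blockSize;
  private long pos = 0;
  private long blockEnd = -1;                // last offset served by blockReader, -1 if none
  private ByteArrayInputStream blockReader;  // stands in for the socket and readahead buffer
  int readersOpened = 0;                     // how often a block reader had to be (re)opened

  SketchInputStream(byte[] data, int blockSize) {
    this.data = data;
    this.blockSize = blockSize;
  }

  @Override
  public int read() {
    if (pos >= data.length) {
      return -1;
    }
    if (blockReader == null || pos > blockEnd) {
      // Open a reader covering the rest of the block that contains pos.
      long start = (pos / blockSize) * blockSize;
      blockEnd = Math.min(start + blockSize, data.length) - 1;
      blockReader = new ByteArrayInputStream(data, (int) pos, (int) (blockEnd - pos + 1));
      readersOpened++;
    }
    pos++;
    return blockReader.read();
  }

  void seek(long target) {
    if (target >= pos && target <= blockEnd) {
      // Fast path: skip forward inside the current block. With a stale
      // blockEnd and a null blockReader this line would throw.
      blockReader.skip(target - pos);
    } else {
      blockReader = null;
      blockEnd = -1;
    }
    pos = target;
  }

  @Override
  public void unbuffer() {
    blockReader = null;  // release the buffer and "socket"
    blockEnd = -1;       // keep seek() off the fast path; pos is left unchanged
  }

  long getPos() {
    return pos;
  }
}

class UnbufferSketch {
  /** Callers probe for the capability, so streams that lack it keep working. */
  static boolean unbufferIfSupported(InputStream in) {
    if (in instanceof UnbufferCapable) {
      ((UnbufferCapable) in).unbuffer();
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    SketchInputStream in =
        new SketchInputStream(new byte[] {10, 11, 12, 13, 14, 15, 16, 17}, 4);
    in.read();
    unbufferIfSupported(in);
    in.seek(2);  // safe only because unbuffer() reset blockEnd
    System.out.println(in.read());
  }
}
```

In this sketch, removing the {{blockEnd = -1}} line from {{unbuffer()}} makes the {{seek(2)}} call take the fast path and dereference a null {{blockReader}}, which is the failure mode the comment describes.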
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309862#comment-14309862 ]

Colin Patrick McCabe commented on HDFS-7694:

I filed HDFS-7744 for the existing seek after setReadahead bug. It's not good form to combine a new feature with a bugfix.
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310259#comment-14310259 ]

Hadoop QA commented on HDFS-7694:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12697115/HDFS-7694.003.patch
  against trunk revision 4c48432.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

                  org.apache.hadoop.ipc.TestCallQueueManager

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9466//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9466//console

This message is automatically generated.
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308630#comment-14308630 ]

Hadoop QA commented on HDFS-7694:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12696959/HDFS-7694.002.patch
  against trunk revision 9d91069.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9452//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9452//console

This message is automatically generated.
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308729#comment-14308729 ]

stack commented on HDFS-7694:

CanUnbuffer ain't too pretty. Unbufferable is about as ugly. It's fine I suppose as is.

In DFSIS#unbuffer, should we be resetting data members back to zero, etc?

In testOpenManyFilesViaTcp, we assert we can read but is there a reason why we would not be able to that unbuffer enables? (pardon if dumb question)
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308568#comment-14308568 ]

Colin Patrick McCabe commented on HDFS-7694:

bq. One question: in what cases does a user need to unbuffer instead of closing the stream?

Good question. The main answer is that re-opening a stream will cause a getBlockLocations RPC to the NameNode. Some applications cache a lot of open streams in order to avoid generating a lot of NameNode traffic. HBase is one, Impala is another. This change is a really easy way to let those applications save memory without generating a lot of RPC load on the NN.
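
The trade-off described in the reply above (keep streams open to avoid NameNode RPCs, but release their buffers when idle) can be sketched as a toy stream cache. Everything here is an assumption made for illustration: {{CachedStream}}, {{StreamCache}}, {{releaseIdle}} and the {{nameNodeRpcs}} counter are hypothetical and do not correspond to HBase, Impala, or Hadoop classes. The counter only simulates the getBlockLocations call that a real open would issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an open HDFS stream; not the Hadoop API. */
class CachedStream {
  final String path;
  boolean buffered = false;  // true while the stream holds a buffer and socket

  CachedStream(String path) {
    this.path = path;
  }

  void read() {
    buffered = true;         // reading acquires the buffer and socket
  }

  void unbuffer() {
    buffered = false;        // release them; the stream itself stays open
  }
}

/** Toy cache of open streams, keyed by path. */
class StreamCache {
  private final Map<String, CachedStream> open = new HashMap<>();
  int nameNodeRpcs = 0;      // counts simulated getBlockLocations calls

  CachedStream get(String path) {
    CachedStream s = open.get(path);
    if (s == null) {
      nameNodeRpcs++;        // only a fresh open pays for the RPC
      s = new CachedStream(path);
      open.put(path, s);
    }
    return s;
  }

  /** Free per-stream resources without closing, so later reads need no re-open. */
  void releaseIdle() {
    for (CachedStream s : open.values()) {
      s.unbuffer();
    }
  }

  int bufferedCount() {
    int n = 0;
    for (CachedStream s : open.values()) {
      if (s.buffered) {
        n++;
      }
    }
    return n;
  }
}

class StreamCacheSketch {
  public static void main(String[] args) {
    StreamCache cache = new StreamCache();
    cache.get("/a").read();
    cache.get("/b").read();
    cache.releaseIdle();     // memory and sockets freed, streams still cached
    cache.get("/a").read();  // reuses the cached stream, no new RPC
    System.out.println("RPCs: " + cache.nameNodeRpcs + ", buffered: " + cache.bufferedCount());
  }
}
```

Had the cache closed and evicted the idle streams instead, the second read of {{/a}} would have gone through the open path again and incremented the RPC counter.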
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308531#comment-14308531 ]

Yi Liu commented on HDFS-7694:

Hi Colin, could you please re-base the patch?

One question: in what cases does a user need to unbuffer instead of closing the stream?
[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer
[ https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296248#comment-14296248 ]

Hadoop QA commented on HDFS-7694:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12695104/HDFS-7694.001.patch
  against trunk revision caf7298.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9362//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9362//console

This message is automatically generated.