[jira] [Updated] (HDFS-14993) checkDiskError doesn't work during datanode startup
[ https://issues.apache.org/jira/browse/HDFS-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-14993: Attachment: HDFS-14993.patch Status: Patch Available (was: Open) > checkDiskError doesn't work during datanode startup > --- > > Key: HDFS-14993 > URL: https://issues.apache.org/jira/browse/HDFS-14993 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Yang Yun >Assignee: Yang Yun >Priority: Major > Attachments: HDFS-14993.patch, HDFS-14993.patch > > > the function checkDiskError() is called before addBlockPool, but list > bpSlices is empty this time. So the function check() in FsVolumeImpl.java > does nothing. > @Override > public VolumeCheckResult check(VolumeCheckContext ignored) > throws DiskErrorException { > // TODO:FEDERATION valid synchronization > for (BlockPoolSlice s : bpSlices.values()) { > s.checkDirs(); > } > return VolumeCheckResult.HEALTHY; > } -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14993) checkDiskError doesn't work during datanode startup
[ https://issues.apache.org/jira/browse/HDFS-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-14993: Status: Open (was: Patch Available) > checkDiskError doesn't work during datanode startup > --- > > Key: HDFS-14993 > URL: https://issues.apache.org/jira/browse/HDFS-14993 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Yang Yun >Assignee: Yang Yun >Priority: Major > Attachments: HDFS-14993.patch, HDFS-14993.patch > > > the function checkDiskError() is called before addBlockPool, but list > bpSlices is empty this time. So the function check() in FsVolumeImpl.java > does nothing. > @Override > public VolumeCheckResult check(VolumeCheckContext ignored) > throws DiskErrorException { > // TODO:FEDERATION valid synchronization > for (BlockPoolSlice s : bpSlices.values()) { > s.checkDirs(); > } > return VolumeCheckResult.HEALTHY; > } -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14993) checkDiskError doesn't work during datanode startup
[ https://issues.apache.org/jira/browse/HDFS-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-14993: Attachment: HDFS-14993.patch > checkDiskError doesn't work during datanode startup > --- > > Key: HDFS-14993 > URL: https://issues.apache.org/jira/browse/HDFS-14993 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Yang Yun >Priority: Major > Attachments: HDFS-14993.patch > > > the function checkDiskError() is called before addBlockPool, but list > bpSlices is empty this time. So the function check() in FsVolumeImpl.java > does nothing. > @Override > public VolumeCheckResult check(VolumeCheckContext ignored) > throws DiskErrorException { > // TODO:FEDERATION valid synchronization > for (BlockPoolSlice s : bpSlices.values()) { > s.checkDirs(); > } > return VolumeCheckResult.HEALTHY; > } -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14993) checkDiskError doesn't work during datanode startup
Yang Yun created HDFS-14993: --- Summary: checkDiskError doesn't work during datanode startup Key: HDFS-14993 URL: https://issues.apache.org/jira/browse/HDFS-14993 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Yang Yun The function checkDiskError() is called before addBlockPool(), but the list bpSlices is still empty at that point, so check() in FsVolumeImpl.java does nothing:

@Override
public VolumeCheckResult check(VolumeCheckContext ignored)
    throws DiskErrorException {
  // TODO:FEDERATION valid synchronization
  for (BlockPoolSlice s : bpSlices.values()) {
    s.checkDirs();
  }
  return VolumeCheckResult.HEALTHY;
}

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
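The no-op is visible in the loop above: before addBlockPool() runs there is nothing to iterate, so the checker reports HEALTHY without touching the disk. A minimal sketch of one possible direction for the fix (hypothetical; the real patch lives in FsVolumeImpl, and these class/method names are simplified stand-ins) is to fall back to checking the volume's base directory when bpSlices is empty:

```java
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of FsVolumeImpl.check(): when no block pools have been
// added yet, verify the volume's base directory instead of silently
// reporting HEALTHY. Names are illustrative, not HDFS internals.
public class VolumeCheckSketch {
    enum VolumeCheckResult { HEALTHY, FAILED }

    private final File baseDir;
    private final Map<String, File> bpSlices = new ConcurrentHashMap<>();

    VolumeCheckSketch(File baseDir) { this.baseDir = baseDir; }

    void addBlockPool(String bpId, File dir) { bpSlices.put(bpId, dir); }

    VolumeCheckResult check() {
        if (bpSlices.isEmpty()) {
            // Startup case: addBlockPool() has not run yet, so check the
            // volume root directly rather than doing nothing.
            return isUsable(baseDir) ? VolumeCheckResult.HEALTHY
                                     : VolumeCheckResult.FAILED;
        }
        for (File slice : bpSlices.values()) {
            if (!isUsable(slice)) {
                return VolumeCheckResult.FAILED;
            }
        }
        return VolumeCheckResult.HEALTHY;
    }

    private static boolean isUsable(File dir) {
        return dir.exists() && dir.isDirectory()
            && dir.canRead() && dir.canWrite();
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        VolumeCheckSketch v = new VolumeCheckSketch(tmp);
        // Before any block pool is added, the volume root is still checked.
        System.out.println(v.check());
    }
}
```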
[jira] [Commented] (HDFS-14627) Improvements to make slow archive storage works on HDFS
[ https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904877#comment-16904877 ] Yang Yun commented on HDFS-14627: - Uploaded a patch for this proposal, including:
# One solution to set special configuration for a specific StorageType.
# Change the timeout for different StorageTypes.
# Add an option to disable the block scanner for a StorageType.
# Check the filesystem for a given StorageType. (For some remotely mounted filesystems, this check is important to make sure the storage is mounted correctly.)
# Sleep a while during a long checkAndUpdate if the difference between disk and memory is large. (In one case, many datanodes with slow disks started at the same time.)
# Add an option to save the replica cache file to another place. (On a slow disk, saving replica info may take a long time and can't finish within the shutdownHook.)
# Save the capacity of the volume to reduce DF system calls. (On some remote disks, DF is expensive.)
> Improvements to make slow archive storage works on HDFS > --- > > Key: HDFS-14627 > URL: https://issues.apache.org/jira/browse/HDFS-14627 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Yang Yun >Priority: Minor > Attachments: HDFS-14627.patch, > data_flow_between_datanode_and_aws_s3.jpg > > > In our setup, we mount archival storage from remote. the write speed is about > 20M/Sec, the read speed is about 40M/Sec, and the normal file operations, for > example 'ls', are time consuming. > we add some improvements to make this kind of archive storage works in > currrent hdfs system. > 1. Add multiply to read/write timeout if block saved on archive storage. > 2. Save replica cache file of archive storage to other fast disk for quick > restart datanode, shutdownHook may does not execute if the saving takes too > long time. > 3. Check mount file system before using mounted archive storage. > 4. Reduce or avoid call DF during generating heartbeat report for archive > storage. > 5. Add option to skip archive block during decommission. > 6. Use multi-threads to scan archive storage. > 7. Check archive storage error with retry times. > 8. Add option to disable scan block on archive storage. > 9. Sleep a heartBeat time if there are too many difference when call > checkAndUpdate in DirectoryScanner > 10. An auto-service to scan fsimage and set the storage policy of files > according to policy. > 11. An auto-service to call mover to move the blocks to right storage. > 12. Dedup files on remote storage if the storage is reliable. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
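The last item in the patch list above (caching the volume capacity so heartbeat-report generation avoids the expensive DF call on remote mounts) can be sketched as a time-bounded cache. This is an illustrative model, not the patch's code; the refresh interval and names are made up:

```java
import java.io.File;

// Hypothetical sketch: cache the volume capacity so that heartbeat report
// generation does not invoke the (expensive, for remote mounts) DF call
// every time. getTotalSpace() stands in for DF here.
public class CachedCapacity {
    private final File volume;
    private final long refreshIntervalMs;
    private volatile long cachedCapacity = -1;
    private volatile long lastRefreshMs = 0;

    public CachedCapacity(File volume, long refreshIntervalMs) {
        this.volume = volume;
        this.refreshIntervalMs = refreshIntervalMs;
    }

    public long getCapacity() {
        long now = System.currentTimeMillis();
        if (cachedCapacity < 0 || now - lastRefreshMs > refreshIntervalMs) {
            cachedCapacity = volume.getTotalSpace();  // the "DF" call
            lastRefreshMs = now;
        }
        return cachedCapacity;  // served from cache until the interval expires
    }

    public static void main(String[] args) {
        CachedCapacity c = new CachedCapacity(
            new File(System.getProperty("java.io.tmpdir")), 60_000);
        long first = c.getCapacity();
        long second = c.getCapacity();  // no second DF within the interval
        System.out.println(first == second);
    }
}
```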
[jira] [Updated] (HDFS-14627) Improvements to make slow archive storage works on HDFS
[ https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-14627: Attachment: HDFS-14627.patch > Improvements to make slow archive storage works on HDFS > --- > > Key: HDFS-14627 > URL: https://issues.apache.org/jira/browse/HDFS-14627 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Yang Yun >Priority: Minor > Attachments: HDFS-14627.patch, > data_flow_between_datanode_and_aws_s3.jpg > > > In our setup, we mount archival storage from remote. the write speed is about > 20M/Sec, the read speed is about 40M/Sec, and the normal file operations, for > example 'ls', are time consuming. > we add some improvements to make this kind of archive storage works in > currrent hdfs system. > 1. Add multiply to read/write timeout if block saved on archive storage. > 2. Save replica cache file of archive storage to other fast disk for quick > restart datanode, shutdownHook may does not execute if the saving takes too > long time. > 3. Check mount file system before using mounted archive storage. > 4. Reduce or avoid call DF during generating heartbeat report for archive > storage. > 5. Add option to skip archive block during decommission. > 6. Use multi-threads to scan archive storage. > 7. Check archive storage error with retry times. > 8. Add option to disable scan block on archive storage. > 9. Sleep a heartBeat time if there are too many difference when call > checkAndUpdate in DirectoryScanner > 10. An auto-service to scan fsimage and set the storage policy of files > according to policy. > 11. An auto-service to call mover to move the blocks to right storage. > 12. Dedup files on remote storage if the storage is reliable. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14627) Improvements to make slow archive storage works on HDFS
[ https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-14627: Attachment: (was: data_flow_between_datanode_and_aws_s3.jpg) > Improvements to make slow archive storage works on HDFS > --- > > Key: HDFS-14627 > URL: https://issues.apache.org/jira/browse/HDFS-14627 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Yang Yun >Priority: Minor > Attachments: data_flow_between_datanode_and_aws_s3.jpg > > > In our setup, we mount archival storage from remote. the write speed is about > 20M/Sec, the read speed is about 40M/Sec, and the normal file operations, for > example 'ls', are time consuming. > we add some improvements to make this kind of archive storage works in > currrent hdfs system. > 1. Add multiply to read/write timeout if block saved on archive storage. > 2. Save replica cache file of archive storage to other fast disk for quick > restart datanode, shutdownHook may does not execute if the saving takes too > long time. > 3. Check mount file system before using mounted archive storage. > 4. Reduce or avoid call DF during generating heartbeat report for archive > storage. > 5. Add option to skip archive block during decommission. > 6. Use multi-threads to scan archive storage. > 7. Check archive storage error with retry times. > 8. Add option to disable scan block on archive storage. > 9. Sleep a heartBeat time if there are too many difference when call > checkAndUpdate in DirectoryScanner > 10. An auto-service to scan fsimage and set the storage policy of files > according to policy. > 11. An auto-service to call mover to move the blocks to right storage. > 12. Dedup files on remote storage if the storage is reliable. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14627) Improvements to make slow archive storage works on HDFS
[ https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-14627: Environment: (was: !data_flow_between_datanode_and_aws_s3.jpg!) > Improvements to make slow archive storage works on HDFS > --- > > Key: HDFS-14627 > URL: https://issues.apache.org/jira/browse/HDFS-14627 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Yang Yun >Priority: Minor > Attachments: data_flow_between_datanode_and_aws_s3.jpg > > > In our setup, we mount archival storage from remote. the write speed is about > 20M/Sec, the read speed is about 40M/Sec, and the normal file operations, for > example 'ls', are time consuming. > we add some improvements to make this kind of archive storage works in > currrent hdfs system. > 1. Add multiply to read/write timeout if block saved on archive storage. > 2. Save replica cache file of archive storage to other fast disk for quick > restart datanode, shutdownHook may does not execute if the saving takes too > long time. > 3. Check mount file system before using mounted archive storage. > 4. Reduce or avoid call DF during generating heartbeat report for archive > storage. > 5. Add option to skip archive block during decommission. > 6. Use multi-threads to scan archive storage. > 7. Check archive storage error with retry times. > 8. Add option to disable scan block on archive storage. > 9. Sleep a heartBeat time if there are too many difference when call > checkAndUpdate in DirectoryScanner > 10. An auto-service to scan fsimage and set the storage policy of files > according to policy. > 11. An auto-service to call mover to move the blocks to right storage. > 12. Dedup files on remote storage if the storage is reliable. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14627) Improvements to make slow archive storage works on HDFS
[ https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-14627: Attachment: data_flow_between_datanode_and_aws_s3.jpg > Improvements to make slow archive storage works on HDFS > --- > > Key: HDFS-14627 > URL: https://issues.apache.org/jira/browse/HDFS-14627 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Yang Yun >Priority: Minor > Attachments: data_flow_between_datanode_and_aws_s3.jpg > > > In our setup, we mount archival storage from remote. the write speed is about > 20M/Sec, the read speed is about 40M/Sec, and the normal file operations, for > example 'ls', are time consuming. > we add some improvements to make this kind of archive storage works in > currrent hdfs system. > 1. Add multiply to read/write timeout if block saved on archive storage. > 2. Save replica cache file of archive storage to other fast disk for quick > restart datanode, shutdownHook may does not execute if the saving takes too > long time. > 3. Check mount file system before using mounted archive storage. > 4. Reduce or avoid call DF during generating heartbeat report for archive > storage. > 5. Add option to skip archive block during decommission. > 6. Use multi-threads to scan archive storage. > 7. Check archive storage error with retry times. > 8. Add option to disable scan block on archive storage. > 9. Sleep a heartBeat time if there are too many difference when call > checkAndUpdate in DirectoryScanner > 10. An auto-service to scan fsimage and set the storage policy of files > according to policy. > 11. An auto-service to call mover to move the blocks to right storage. > 12. Dedup files on remote storage if the storage is reliable. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14627) Improvements to make slow archive storage works on HDFS
Yang Yun created HDFS-14627: --- Summary: Improvements to make slow archive storage works on HDFS Key: HDFS-14627 URL: https://issues.apache.org/jira/browse/HDFS-14627 Project: Hadoop HDFS Issue Type: Improvement Environment: !data_flow_between_datanode_and_aws_s3.jpg! Reporter: Yang Yun Attachments: data_flow_between_datanode_and_aws_s3.jpg In our setup, we mount archival storage from a remote system. The write speed is about 20 MB/s, the read speed is about 40 MB/s, and normal file operations, for example 'ls', are time-consuming. We added some improvements to make this kind of archive storage work in the current HDFS system.
1. Add a multiplier to the read/write timeout if the block is saved on archive storage.
2. Save the replica cache file of archive storage to another, faster disk for quick datanode restart; the shutdownHook may not finish executing if the save takes too long.
3. Check the mounted file system before using mounted archive storage.
4. Reduce or avoid DF calls while generating the heartbeat report for archive storage.
5. Add an option to skip archive blocks during decommission.
6. Use multiple threads to scan archive storage.
7. Check archive storage errors with a retry limit.
8. Add an option to disable block scanning on archive storage.
9. Sleep for one heartbeat interval if there are too many differences when checkAndUpdate is called in DirectoryScanner.
10. An auto-service to scan the fsimage and set the storage policy of files according to policy.
11. An auto-service to call the mover to move blocks to the right storage.
12. Dedup files on remote storage if the storage is reliable.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
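Item 1 above (scaling the read/write timeout for replicas on archive storage) amounts to a per-storage-type multiplier applied to the base socket timeout. A minimal sketch, with made-up default multipliers and a StorageType enum mirroring HDFS's, follows; the real patch presumably wires this to configuration keys:

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch of item 1: multiply the base timeout when the replica
// lives on slow (e.g. remotely mounted ARCHIVE) storage. The storage types
// mirror HDFS's StorageType enum; the multiplier values are illustrative.
public class StorageTimeoutSketch {
    enum StorageType { RAM_DISK, SSD, DISK, ARCHIVE }

    private static final Map<StorageType, Integer> MULTIPLIER =
        new EnumMap<>(StorageType.class);
    static {
        MULTIPLIER.put(StorageType.RAM_DISK, 1);
        MULTIPLIER.put(StorageType.SSD, 1);
        MULTIPLIER.put(StorageType.DISK, 1);
        MULTIPLIER.put(StorageType.ARCHIVE, 4);  // slow remote mount
    }

    static long effectiveTimeoutMs(long baseTimeoutMs, StorageType type) {
        return baseTimeoutMs * MULTIPLIER.getOrDefault(type, 1);
    }

    public static void main(String[] args) {
        // A 60s base read timeout becomes 240s for archive replicas.
        System.out.println(effectiveTimeoutMs(60_000, StorageType.ARCHIVE));
    }
}
```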
[jira] [Updated] (HDFS-13377) The owner of folder can set quota for his sub folder
[ https://issues.apache.org/jira/browse/HDFS-13377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-13377: Attachment: HADOOP-13377.patch > The owner of folder can set quota for his sub folder > > > Key: HDFS-13377 > URL: https://issues.apache.org/jira/browse/HDFS-13377 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yang Yun >Priority: Minor > Attachments: HADOOP-13377.patch > > > Currently, only super user can set quota. That is huge burden for > administrator in a large system. Add a new feature to let the owner of a > folder also has the privilege to set quota for his sub folders. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-13377) The owner of folder can set quota for his sub folder
Yang Yun created HDFS-13377: --- Summary: The owner of folder can set quota for his sub folder Key: HDFS-13377 URL: https://issues.apache.org/jira/browse/HDFS-13377 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yang Yun Currently, only the superuser can set quotas. That is a huge burden for the administrator in a large system. Add a new feature to let the owner of a folder also have the privilege to set quotas for his sub-folders. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
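The proposal boils down to relaxing the permission check in the Namenode's setQuota path. A hypothetical sketch of that check (the real one sits in FSDirAttrOp/FSNamesystem; these names and the flat owner check are simplifications):

```java
// Hypothetical sketch of the proposed setQuota permission check: keep the
// existing superuser privilege, and additionally allow the owner of the
// target directory. Names here are illustrative, not HDFS internals.
public class QuotaPermissionSketch {
    static boolean canSetQuota(String caller, boolean isSuperUser,
                               String dirOwner) {
        // Superusers retain the current behavior; owners gain the
        // privilege for directories they own (and their sub-folders).
        return isSuperUser || caller.equals(dirOwner);
    }

    public static void main(String[] args) {
        System.out.println(canSetQuota("alice", false, "alice")); // owner: allowed
        System.out.println(canSetQuota("bob", false, "alice"));   // not owner: denied
    }
}
```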
[jira] [Commented] (HDFS-12964) Read a opened file timely and effectively when it's being written by other
[ https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317619#comment-16317619 ] Yang Yun commented on HDFS-12964: - Yes, pollnewData() has the same function; with it, a user can implement the external blocking read. One more thing: pollnewData() calls getFileInfo and readBlockLength every time, which increases the burden on a busy HDFS system. Do we need to add a cache for the RPC to the Namenode and the connection to the Datanode? In our production environments, many readers may read small messages from only one block for a long time at high speed. > Read a opened file timely and effectively when it's being written by other > -- > > Key: HDFS-12964 > URL: https://issues.apache.org/jira/browse/HDFS-12964 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Yang Yun >Priority: Minor > Attachments: HADOOP-12964.001.patch > > > One thread opens a HDFS file and keeps writing. Another thread opens same > file and keeps reading it at the same time, also want to get the newest > content of file. that happens in many environments, for example, in some > message transmission applications. And it also requires the new content can > be read timely and effectively, for there maybe many tasks are working in > same time. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12964) Read a opened file timely and effectively when it's being written by other
[ https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16315926#comment-16315926 ] Yang Yun commented on HDFS-12964: - DFSSpinInputStream is only useful for a special case: it will hang on read or seek if the position goes beyond the current EOF and the file is open by another writer. The retry logic in DFSInputStream only tries one more time, which is different from hanging. I'm not sure whether we can add a flag to switch DFSInputStream to this kind of behavior. > Read a opened file timely and effectively when it's being written by other > -- > > Key: HDFS-12964 > URL: https://issues.apache.org/jira/browse/HDFS-12964 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Yang Yun >Priority: Minor > Attachments: HADOOP-12964.001.patch > > > One thread opens a HDFS file and keeps writing. Another thread opens same > file and keeps reading it at the same time, also want to get the newest > content of file. that happens in many environments, for example, in some > message transmission applications. And it also requires the new content can > be read timely and effectively, for there maybe many tasks are working in > same time. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12964) Read a opened file timely and effectively when it's being written by other
[ https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312795#comment-16312795 ] Yang Yun commented on HDFS-12964: - Yes, it's similar in function to FsShell's '-tail -f'. The difference is that this is an input-stream API that can be used programmatically; it doesn't request the Namenode every time, and it keeps a cache of the connection to the Datanode. > Read a opened file timely and effectively when it's being written by other > -- > > Key: HDFS-12964 > URL: https://issues.apache.org/jira/browse/HDFS-12964 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Yang Yun >Priority: Minor > Attachments: HADOOP-12964.001.patch > > > One thread opens a HDFS file and keeps writing. Another thread opens same > file and keeps reading it at the same time, also want to get the newest > content of file. that happens in many environments, for example, in some > message transmission applications. And it also requires the new content can > be read timely and effectively, for there maybe many tasks are working in > same time. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12964) Read a opened file timely and effectively when it's being written by other
[ https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-12964: Description: One thread opens a HDFS file and keeps writing. Another thread opens same file and keeps reading it at the same time, also want to get the newest content of file. that happens in many environments, for example, in some message transmission applications. And it also requires the new content can be read timely and effectively, for there maybe many tasks are working in same time. (was: One thread opens a HDFS file and keep writing. Another thread opens same file and keep reading it at the same time, also want to get the newest content of file. that happens at in many environments, for example, in some message transmission applications. And it also requires the new content can be read timely and effectively, for there maybe many tasks are working in same time.) > Read a opened file timely and effectively when it's being written by other > -- > > Key: HDFS-12964 > URL: https://issues.apache.org/jira/browse/HDFS-12964 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Yang Yun >Priority: Minor > Attachments: HADOOP-12964.001.patch > > > One thread opens a HDFS file and keeps writing. Another thread opens same > file and keeps reading it at the same time, also want to get the newest > content of file. that happens in many environments, for example, in some > message transmission applications. And it also requires the new content can > be read timely and effectively, for there maybe many tasks are working in > same time. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12964) Read a opened file timely and effectively when it's being written by other
[ https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-12964: Priority: Minor (was: Critical) > Read a opened file timely and effectively when it's being written by other > -- > > Key: HDFS-12964 > URL: https://issues.apache.org/jira/browse/HDFS-12964 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Yang Yun >Priority: Minor > Attachments: HADOOP-12964.001.patch > > > One thread opens a HDFS file and keep writing. Another thread opens same > file and keep reading it at the same time, also want to get the newest > content of file. that happens at in many environments, for example, in some > message transmission applications. And it also requires the new content can > be read timely and effectively, for there maybe many tasks are working in > same time. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12964) Read a opened file timely and effectively when it's being written by other
[ https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-12964: Description: One thread opens a HDFS file and keep writing. Another thread opens same file and keep reading it at the same time, also want to get the newest content of file. that happens at in many environments, for example, in some message transmission applications. And it also requires the new content can be read timely and effectively, for there maybe many tasks are working in same time. (was: One thread opens a HDFS file and keep writing. Another thread opens same file and keep reading it at the same time, also want to get the newest content of file. that happens at in many environments, for example, talos message transmission. And it also requires the new content can be read timely and effectively, for there maybe many tasks are working in same time.) > Read a opened file timely and effectively when it's being written by other > -- > > Key: HDFS-12964 > URL: https://issues.apache.org/jira/browse/HDFS-12964 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Yang Yun >Priority: Critical > Attachments: HADOOP-12964.001.patch > > > One thread opens a HDFS file and keep writing. Another thread opens same > file and keep reading it at the same time, also want to get the newest > content of file. that happens at in many environments, for example, in some > message transmission applications. And it also requires the new content can > be read timely and effectively, for there maybe many tasks are working in > same time. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12964) Read a opened file timely and effectively when it's being written by other
[ https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-12964: Description: One thread opens a HDFS file and keep writing. Another thread opens same file and keep reading it at the same time, also want to get the newest content of file. that happens at in many environments, for example, talos message transmission. And it also requires the new content can be read timely and effectively, for there maybe many tasks are working in same time. (was: One thread opens a HDFS file and keep writing. Another thread opens same file and keep reading it at the same time, also want to get the newest content of file. that happen at in many environments, for example, talos message transmission. And it also require the new content can be read timely and effectively, for there maybe many tasks are working in same time.) > Read a opened file timely and effectively when it's being written by other > -- > > Key: HDFS-12964 > URL: https://issues.apache.org/jira/browse/HDFS-12964 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Yang Yun >Priority: Critical > Attachments: HADOOP-12964.001.patch > > > One thread opens a HDFS file and keep writing. Another thread opens same > file and keep reading it at the same time, also want to get the newest > content of file. that happens at in many environments, for example, talos > message transmission. And it also requires the new content can be read timely > and effectively, for there maybe many tasks are working in same time. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12964) Read a opened file timely and effectively when it's being written by other
[ https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yun updated HDFS-12964: Attachment: HADOOP-12964.001.patch Added a new DFSSpinInputStream extended from DFSInputStream. If the file is being written by somebody else, DFSSpinInputStream keeps updating the file's length information when a read goes beyond the current EOF. Also added caches for RefreshLocatedBlocks requests to the Namenode and for the connections to the Datanode. > Read a opened file timely and effectively when it's being written by other > -- > > Key: HDFS-12964 > URL: https://issues.apache.org/jira/browse/HDFS-12964 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Yang Yun >Priority: Critical > Attachments: HADOOP-12964.001.patch > > > One thread opens a HDFS file and keep writing. Another thread opens same > file and keep reading it at the same time, also want to get the newest > content of file. that happen at in many environments, for example, talos > message transmission. And it also require the new content can be read timely > and effectively, for there maybe many tasks are working in same time. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-12964) Read a opened file timely and effectively when it's being written by other
Yang Yun created HDFS-12964: --- Summary: Read a opened file timely and effectively when it's being written by other Key: HDFS-12964 URL: https://issues.apache.org/jira/browse/HDFS-12964 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Yang Yun Priority: Critical One thread opens an HDFS file and keeps writing. Another thread opens the same file and keeps reading it at the same time, and wants to get the newest content of the file. That happens in many environments, for example, talos message transmission. It also requires that the new content can be read timely and efficiently, since there may be many tasks working at the same time. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
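The reader-side idea behind the attached DFSSpinInputStream (per the thread above) is: when a read reaches the current EOF of a file that is still being written, refresh the file's length and continue instead of returning -1. A self-contained sketch of that loop, modeled on a local file with RandomAccessFile since the real code targets DFSInputStream and Hadoop internals (method and class names here are illustrative):

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch of tail-following reads: re-check the file length on each call so
// bytes appended after the stream was opened are still visible. The HDFS
// version additionally refreshes block locations from the Namenode.
public class TailReaderSketch {
    // Append any bytes written since `offset` to `out`; returns the new offset.
    static long readNew(Path file, long offset, StringBuilder out)
            throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long len = raf.length();   // refresh length: file may have grown
            if (len <= offset) {
                return offset;         // nothing new yet; caller retries later
            }
            raf.seek(offset);
            byte[] buf = new byte[(int) (len - offset)];
            raf.readFully(buf);
            out.append(new String(buf, StandardCharsets.UTF_8));
            return len;
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("tail", ".log");
        StringBuilder seen = new StringBuilder();
        long off = 0;
        Files.writeString(f, "first ", StandardOpenOption.APPEND);  // writer appends...
        off = readNew(f, off, seen);                                // ...reader catches up
        Files.writeString(f, "second", StandardOpenOption.APPEND);
        off = readNew(f, off, seen);
        System.out.println(seen);  // prints "first second"
    }
}
```

A production variant would cache the open connection between calls, as the comments in this thread discuss, rather than reopening the file each time.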