[jira] [Updated] (HDFS-14993) checkDiskError doesn't work during datanode startup

2019-11-19 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-14993:

Attachment: HDFS-14993.patch
Status: Patch Available  (was: Open)

> checkDiskError doesn't work during datanode startup
> ---
>
> Key: HDFS-14993
> URL: https://issues.apache.org/jira/browse/HDFS-14993
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Major
> Attachments: HDFS-14993.patch, HDFS-14993.patch
>
>
> The function checkDiskError() is called before addBlockPool(), but the list 
> bpSlices is still empty at that point, so check() in FsVolumeImpl.java does 
> nothing:
> @Override
> public VolumeCheckResult check(VolumeCheckContext ignored)
>     throws DiskErrorException {
>   // TODO:FEDERATION valid synchronization
>   for (BlockPoolSlice s : bpSlices.values()) {
>     s.checkDirs();
>   }
>   return VolumeCheckResult.HEALTHY;
> }
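The no-op can be reproduced in miniature: iterating an empty map executes nothing, so a simplified stand-in for check() reports HEALTHY without ever touching the disk. This is an illustrative sketch, not Hadoop code; the class, enum, and Runnable stand-in below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class EmptyCheckDemo {
    enum VolumeCheckResult { HEALTHY, FAILED }

    // Simplified stand-in for FsVolumeImpl#check(): iterate the block pool
    // slices and report HEALTHY if none of them throws.
    static VolumeCheckResult check(Map<String, Runnable> bpSlices) {
        for (Runnable s : bpSlices.values()) {
            s.run();  // stand-in for s.checkDirs()
        }
        return VolumeCheckResult.HEALTHY;
    }

    public static void main(String[] args) {
        // During startup addBlockPool has not run yet, so the map is empty:
        // the loop body never executes and the volume is reported HEALTHY
        // even if the underlying disk is actually broken.
        System.out.println(check(new HashMap<>()));  // prints HEALTHY
    }
}
```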



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14993) checkDiskError doesn't work during datanode startup

2019-11-19 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-14993:

Status: Open  (was: Patch Available)

> checkDiskError doesn't work during datanode startup
> ---
>
> Key: HDFS-14993
> URL: https://issues.apache.org/jira/browse/HDFS-14993
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Major
> Attachments: HDFS-14993.patch, HDFS-14993.patch
>
>
> The function checkDiskError() is called before addBlockPool(), but the list 
> bpSlices is still empty at that point, so check() in FsVolumeImpl.java does 
> nothing:
> @Override
> public VolumeCheckResult check(VolumeCheckContext ignored)
>     throws DiskErrorException {
>   // TODO:FEDERATION valid synchronization
>   for (BlockPoolSlice s : bpSlices.values()) {
>     s.checkDirs();
>   }
>   return VolumeCheckResult.HEALTHY;
> }






[jira] [Updated] (HDFS-14993) checkDiskError doesn't work during datanode startup

2019-11-19 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-14993:

Attachment: HDFS-14993.patch

> checkDiskError doesn't work during datanode startup
> ---
>
> Key: HDFS-14993
> URL: https://issues.apache.org/jira/browse/HDFS-14993
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Yang Yun
>Priority: Major
> Attachments: HDFS-14993.patch
>
>
> The function checkDiskError() is called before addBlockPool(), but the list 
> bpSlices is still empty at that point, so check() in FsVolumeImpl.java does 
> nothing:
> @Override
> public VolumeCheckResult check(VolumeCheckContext ignored)
>     throws DiskErrorException {
>   // TODO:FEDERATION valid synchronization
>   for (BlockPoolSlice s : bpSlices.values()) {
>     s.checkDirs();
>   }
>   return VolumeCheckResult.HEALTHY;
> }






[jira] [Created] (HDFS-14993) checkDiskError doesn't work during datanode startup

2019-11-19 Thread Yang Yun (Jira)
Yang Yun created HDFS-14993:
---

 Summary: checkDiskError doesn't work during datanode startup
 Key: HDFS-14993
 URL: https://issues.apache.org/jira/browse/HDFS-14993
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Yang Yun


The function checkDiskError() is called before addBlockPool(), but the list 
bpSlices is still empty at that point, so check() in FsVolumeImpl.java does 
nothing:

@Override
public VolumeCheckResult check(VolumeCheckContext ignored)
    throws DiskErrorException {
  // TODO:FEDERATION valid synchronization
  for (BlockPoolSlice s : bpSlices.values()) {
    s.checkDirs();
  }
  return VolumeCheckResult.HEALTHY;
}






[jira] [Commented] (HDFS-14627) Improvements to make slow archive storage works on HDFS

2019-08-12 Thread Yang Yun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904877#comment-16904877
 ] 

Yang Yun commented on HDFS-14627:
-

Uploaded a patch for this proposal, including:
 # A way to set a special configuration for a specific StorageType.
 # Different timeouts for different StorageTypes.
 # An option to disable the block scanner for a StorageType.
 # A filesystem check for a given StorageType (for some remote-mounted 
filesystems this check is important to make sure the storage is mounted 
correctly).
 # Sleeping for a while during a long checkAndUpdate if the difference between 
disk and memory is large (in one case, many datanodes with slow disks started 
at the same time).
 # An option to save the replica cache file to another place (on a slow disk, 
saving replica info may take so long that it cannot finish within the shutdown 
hook).
 # Caching the volume capacity to reduce DF system calls (on some remote 
disks, DF is expensive).

 

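The first two items above can be sketched as a per-StorageType timeout multiplier picked from a small table. This is a hypothetical sketch, not the attached patch; the class, enum values, and multiplier figures are all illustrative assumptions.

```java
import java.util.EnumMap;
import java.util.Map;

public class StorageTypeTimeouts {
    enum StorageType { DISK, SSD, ARCHIVE }

    private final Map<StorageType, Integer> multiplier =
        new EnumMap<>(StorageType.class);

    StorageTypeTimeouts() {
        multiplier.put(StorageType.DISK, 1);
        multiplier.put(StorageType.SSD, 1);
        // Remote archive storage is roughly an order of magnitude slower,
        // so give its reads and writes a larger timeout budget.
        multiplier.put(StorageType.ARCHIVE, 10);
    }

    // Scale a base timeout by the multiplier for the block's storage type.
    long readTimeoutMs(StorageType type, long baseTimeoutMs) {
        return baseTimeoutMs * multiplier.getOrDefault(type, 1);
    }
}
```

A caller would look up the StorageType of the volume holding the block and scale the configured timeout before issuing the read.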
> Improvements to make slow archive storage works on HDFS
> ---
>
> Key: HDFS-14627
> URL: https://issues.apache.org/jira/browse/HDFS-14627
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Priority: Minor
> Attachments: HDFS-14627.patch, 
> data_flow_between_datanode_and_aws_s3.jpg
>
>
> In our setup, we mount archival storage from a remote system. The write 
> speed is about 20 MB/s, the read speed about 40 MB/s, and normal file 
> operations, for example 'ls', are time consuming.
> We added some improvements to make this kind of archive storage work in the 
> current HDFS system:
> 1. Apply a multiplier to read/write timeouts when a block is stored on 
> archive storage.
> 2. Save the replica cache file of archive storage to another, faster disk 
> for quick datanode restart; the shutdown hook may not finish if saving takes 
> too long.
> 3. Check the mounted file system before using mounted archive storage.
> 4. Reduce or avoid DF calls while generating the heartbeat report for 
> archive storage.
> 5. Add an option to skip archive blocks during decommission.
> 6. Use multiple threads to scan archive storage.
> 7. Check archive storage errors with a retry limit.
> 8. Add an option to disable block scanning on archive storage.
> 9. Sleep for one heartbeat interval if there are too many differences when 
> calling checkAndUpdate in DirectoryScanner.
> 10. An auto-service that scans the fsimage and sets the storage policy of 
> files according to policy.
> 11. An auto-service that calls the mover to move blocks to the right storage.
> 12. Dedup files on remote storage if the storage is reliable.






[jira] [Updated] (HDFS-14627) Improvements to make slow archive storage works on HDFS

2019-08-12 Thread Yang Yun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-14627:

Attachment: HDFS-14627.patch

> Improvements to make slow archive storage works on HDFS
> ---
>
> Key: HDFS-14627
> URL: https://issues.apache.org/jira/browse/HDFS-14627
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Priority: Minor
> Attachments: HDFS-14627.patch, 
> data_flow_between_datanode_and_aws_s3.jpg
>
>
> In our setup, we mount archival storage from a remote system. The write 
> speed is about 20 MB/s, the read speed about 40 MB/s, and normal file 
> operations, for example 'ls', are time consuming.
> We added some improvements to make this kind of archive storage work in the 
> current HDFS system:
> 1. Apply a multiplier to read/write timeouts when a block is stored on 
> archive storage.
> 2. Save the replica cache file of archive storage to another, faster disk 
> for quick datanode restart; the shutdown hook may not finish if saving takes 
> too long.
> 3. Check the mounted file system before using mounted archive storage.
> 4. Reduce or avoid DF calls while generating the heartbeat report for 
> archive storage.
> 5. Add an option to skip archive blocks during decommission.
> 6. Use multiple threads to scan archive storage.
> 7. Check archive storage errors with a retry limit.
> 8. Add an option to disable block scanning on archive storage.
> 9. Sleep for one heartbeat interval if there are too many differences when 
> calling checkAndUpdate in DirectoryScanner.
> 10. An auto-service that scans the fsimage and sets the storage policy of 
> files according to policy.
> 11. An auto-service that calls the mover to move blocks to the right storage.
> 12. Dedup files on remote storage if the storage is reliable.






[jira] [Updated] (HDFS-14627) Improvements to make slow archive storage works on HDFS

2019-07-02 Thread Yang Yun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-14627:

Attachment: (was: data_flow_between_datanode_and_aws_s3.jpg)

> Improvements to make slow archive storage works on HDFS
> ---
>
> Key: HDFS-14627
> URL: https://issues.apache.org/jira/browse/HDFS-14627
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Priority: Minor
> Attachments: data_flow_between_datanode_and_aws_s3.jpg
>
>
> In our setup, we mount archival storage from a remote system. The write 
> speed is about 20 MB/s, the read speed about 40 MB/s, and normal file 
> operations, for example 'ls', are time consuming.
> We added some improvements to make this kind of archive storage work in the 
> current HDFS system:
> 1. Apply a multiplier to read/write timeouts when a block is stored on 
> archive storage.
> 2. Save the replica cache file of archive storage to another, faster disk 
> for quick datanode restart; the shutdown hook may not finish if saving takes 
> too long.
> 3. Check the mounted file system before using mounted archive storage.
> 4. Reduce or avoid DF calls while generating the heartbeat report for 
> archive storage.
> 5. Add an option to skip archive blocks during decommission.
> 6. Use multiple threads to scan archive storage.
> 7. Check archive storage errors with a retry limit.
> 8. Add an option to disable block scanning on archive storage.
> 9. Sleep for one heartbeat interval if there are too many differences when 
> calling checkAndUpdate in DirectoryScanner.
> 10. An auto-service that scans the fsimage and sets the storage policy of 
> files according to policy.
> 11. An auto-service that calls the mover to move blocks to the right storage.
> 12. Dedup files on remote storage if the storage is reliable.






[jira] [Updated] (HDFS-14627) Improvements to make slow archive storage works on HDFS

2019-07-02 Thread Yang Yun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-14627:

Environment: (was: !data_flow_between_datanode_and_aws_s3.jpg!)

> Improvements to make slow archive storage works on HDFS
> ---
>
> Key: HDFS-14627
> URL: https://issues.apache.org/jira/browse/HDFS-14627
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Priority: Minor
> Attachments: data_flow_between_datanode_and_aws_s3.jpg
>
>
> In our setup, we mount archival storage from a remote system. The write 
> speed is about 20 MB/s, the read speed about 40 MB/s, and normal file 
> operations, for example 'ls', are time consuming.
> We added some improvements to make this kind of archive storage work in the 
> current HDFS system:
> 1. Apply a multiplier to read/write timeouts when a block is stored on 
> archive storage.
> 2. Save the replica cache file of archive storage to another, faster disk 
> for quick datanode restart; the shutdown hook may not finish if saving takes 
> too long.
> 3. Check the mounted file system before using mounted archive storage.
> 4. Reduce or avoid DF calls while generating the heartbeat report for 
> archive storage.
> 5. Add an option to skip archive blocks during decommission.
> 6. Use multiple threads to scan archive storage.
> 7. Check archive storage errors with a retry limit.
> 8. Add an option to disable block scanning on archive storage.
> 9. Sleep for one heartbeat interval if there are too many differences when 
> calling checkAndUpdate in DirectoryScanner.
> 10. An auto-service that scans the fsimage and sets the storage policy of 
> files according to policy.
> 11. An auto-service that calls the mover to move blocks to the right storage.
> 12. Dedup files on remote storage if the storage is reliable.






[jira] [Updated] (HDFS-14627) Improvements to make slow archive storage works on HDFS

2019-07-02 Thread Yang Yun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-14627:

Attachment: data_flow_between_datanode_and_aws_s3.jpg

> Improvements to make slow archive storage works on HDFS
> ---
>
> Key: HDFS-14627
> URL: https://issues.apache.org/jira/browse/HDFS-14627
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yang Yun
>Priority: Minor
> Attachments: data_flow_between_datanode_and_aws_s3.jpg
>
>
> In our setup, we mount archival storage from a remote system. The write 
> speed is about 20 MB/s, the read speed about 40 MB/s, and normal file 
> operations, for example 'ls', are time consuming.
> We added some improvements to make this kind of archive storage work in the 
> current HDFS system:
> 1. Apply a multiplier to read/write timeouts when a block is stored on 
> archive storage.
> 2. Save the replica cache file of archive storage to another, faster disk 
> for quick datanode restart; the shutdown hook may not finish if saving takes 
> too long.
> 3. Check the mounted file system before using mounted archive storage.
> 4. Reduce or avoid DF calls while generating the heartbeat report for 
> archive storage.
> 5. Add an option to skip archive blocks during decommission.
> 6. Use multiple threads to scan archive storage.
> 7. Check archive storage errors with a retry limit.
> 8. Add an option to disable block scanning on archive storage.
> 9. Sleep for one heartbeat interval if there are too many differences when 
> calling checkAndUpdate in DirectoryScanner.
> 10. An auto-service that scans the fsimage and sets the storage policy of 
> files according to policy.
> 11. An auto-service that calls the mover to move blocks to the right storage.
> 12. Dedup files on remote storage if the storage is reliable.






[jira] [Created] (HDFS-14627) Improvements to make slow archive storage works on HDFS

2019-07-02 Thread Yang Yun (JIRA)
Yang Yun created HDFS-14627:
---

 Summary: Improvements to make slow archive storage works on HDFS
 Key: HDFS-14627
 URL: https://issues.apache.org/jira/browse/HDFS-14627
 Project: Hadoop HDFS
  Issue Type: Improvement
 Environment: !data_flow_between_datanode_and_aws_s3.jpg!
Reporter: Yang Yun
 Attachments: data_flow_between_datanode_and_aws_s3.jpg

In our setup, we mount archival storage from a remote system. The write speed 
is about 20 MB/s, the read speed about 40 MB/s, and normal file operations, 
for example 'ls', are time consuming.
We added some improvements to make this kind of archive storage work in the 
current HDFS system:

1. Apply a multiplier to read/write timeouts when a block is stored on archive 
storage.
2. Save the replica cache file of archive storage to another, faster disk for 
quick datanode restart; the shutdown hook may not finish if saving takes too 
long.
3. Check the mounted file system before using mounted archive storage.
4. Reduce or avoid DF calls while generating the heartbeat report for archive 
storage.
5. Add an option to skip archive blocks during decommission.
6. Use multiple threads to scan archive storage.
7. Check archive storage errors with a retry limit.
8. Add an option to disable block scanning on archive storage.
9. Sleep for one heartbeat interval if there are too many differences when 
calling checkAndUpdate in DirectoryScanner.
10. An auto-service that scans the fsimage and sets the storage policy of 
files according to policy.
11. An auto-service that calls the mover to move blocks to the right storage.
12. Dedup files on remote storage if the storage is reliable.
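Item 4 in the list above amounts to a time-based cache in front of the DF probe, so that heartbeat reports reuse a recent capacity value instead of hitting the remote mount every time. A minimal sketch under that assumption; the class, interface, and refresh interval are hypothetical, not the attached patch:

```java
public class CachedCapacity {
    // Stand-in for the real DF probe, which is expensive on remote mounts.
    interface DfCall { long capacity(); }

    private final DfCall df;
    private final long refreshIntervalMs;
    private volatile long cachedCapacity = -1;
    private volatile long lastRefreshMs = 0;

    CachedCapacity(DfCall df, long refreshIntervalMs) {
        this.df = df;
        this.refreshIntervalMs = refreshIntervalMs;
    }

    // Return the cached capacity, refreshing it only when stale, so that
    // frequent heartbeat reports do not each trigger a DF system call.
    long getCapacity() {
        long now = System.currentTimeMillis();
        if (cachedCapacity < 0 || now - lastRefreshMs > refreshIntervalMs) {
            cachedCapacity = df.capacity();
            lastRefreshMs = now;
        }
        return cachedCapacity;
    }
}
```

The sketch is deliberately unsynchronized; a concurrent refresh at worst calls DF twice, which is harmless for a monotonic probe like capacity.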






[jira] [Updated] (HDFS-13377) The owner of folder can set quota for his sub folder

2018-03-30 Thread Yang Yun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-13377:

Attachment: HADOOP-13377.patch

> The owner of folder can set quota for his sub folder
> 
>
> Key: HDFS-13377
> URL: https://issues.apache.org/jira/browse/HDFS-13377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yang Yun
>Priority: Minor
> Attachments: HADOOP-13377.patch
>
>
> Currently, only the superuser can set quotas, which is a huge burden on the 
> administrator in a large system. Add a new feature that also gives the owner 
> of a folder the privilege to set quotas on his subfolders.






[jira] [Created] (HDFS-13377) The owner of folder can set quota for his sub folder

2018-03-30 Thread Yang Yun (JIRA)
Yang Yun created HDFS-13377:
---

 Summary: The owner of folder can set quota for his sub folder
 Key: HDFS-13377
 URL: https://issues.apache.org/jira/browse/HDFS-13377
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Yang Yun


Currently, only the superuser can set quotas, which is a huge burden on the 
administrator in a large system. Add a new feature that also gives the owner 
of a folder the privilege to set quotas on his subfolders.
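Under the assumption that the proposal reduces to "superuser, or owner of an ancestor folder containing the target", the permission rule can be sketched as follows. The class and parameter names are illustrative, not the attached patch:

```java
public class QuotaPermission {
    // Allow setQuota when the caller is the superuser (current behavior),
    // or owns a folder of which the target is a subfolder (proposed
    // extension).
    static boolean canSetQuota(String caller, boolean isSuperUser,
                               String targetPath, String ownedFolder,
                               String ownedFolderOwner) {
        if (isSuperUser) {
            return true;
        }
        return caller.equals(ownedFolderOwner)
            && targetPath.startsWith(ownedFolder + "/");
    }
}
```

For example, a non-superuser "alice" who owns /user/alice could set a quota on /user/alice/data, while "bob" could not.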






[jira] [Commented] (HDFS-12964) Read a opened file timely and effectively when it's being written by other

2018-01-08 Thread Yang Yun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317619#comment-16317619
 ] 

Yang Yun commented on HDFS-12964:
-

Yes, pollnewData() has the same function; with it a user can implement the 
blocking read externally.
One more thing: pollnewData() calls getFileInfo and readBlockLength every 
time, which adds load to a busy HDFS system. Do we need to add a cache for 
the RPCs to the Namenode and the connection to the Datanode? In our 
production environments, many readers may read small messages from a single 
block at high speed for a long time.

> Read a opened file timely and effectively when it's being written by other
> --
>
> Key: HDFS-12964
> URL: https://issues.apache.org/jira/browse/HDFS-12964
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Yang Yun
>Priority: Minor
> Attachments: HADOOP-12964.001.patch
>
>
> One thread opens an HDFS file and keeps writing. Another thread opens the 
> same file and keeps reading it at the same time, wanting to get the newest 
> content of the file. That happens in many environments, for example in some 
> message transmission applications. It also requires that the new content can 
> be read timely and efficiently, because there may be many tasks working at 
> the same time.






[jira] [Commented] (HDFS-12964) Read a opened file timely and effectively when it's being written by other

2018-01-08 Thread Yang Yun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16315926#comment-16315926
 ] 

Yang Yun commented on HDFS-12964:
-

DFSSpinInputStream is only useful for a special case: a read or seek that 
goes beyond the current EOF will block while the file is still open for 
writing by someone else.
The retry logic in DFSInputStream only retries once, which is different from 
blocking. I'm not sure whether we could add a flag that switches 
DFSInputStream to this kind of behavior.

> Read a opened file timely and effectively when it's being written by other
> --
>
> Key: HDFS-12964
> URL: https://issues.apache.org/jira/browse/HDFS-12964
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Yang Yun
>Priority: Minor
> Attachments: HADOOP-12964.001.patch
>
>
> One thread opens an HDFS file and keeps writing. Another thread opens the 
> same file and keeps reading it at the same time, wanting to get the newest 
> content of the file. That happens in many environments, for example in some 
> message transmission applications. It also requires that the new content can 
> be read timely and efficiently, because there may be many tasks working at 
> the same time.






[jira] [Commented] (HDFS-12964) Read a opened file timely and effectively when it's being written by other

2018-01-05 Thread Yang Yun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312795#comment-16312795
 ] 

Yang Yun commented on HDFS-12964:
-

Yes, it is similar in function to the FsShell '-tail -f'.
The difference is that this is an input stream API that can be used from a 
program; it does not contact the Namenode on every call and keeps a cache of 
the connection to the Datanode.

> Read a opened file timely and effectively when it's being written by other
> --
>
> Key: HDFS-12964
> URL: https://issues.apache.org/jira/browse/HDFS-12964
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Yang Yun
>Priority: Minor
> Attachments: HADOOP-12964.001.patch
>
>
> One thread opens an HDFS file and keeps writing. Another thread opens the 
> same file and keeps reading it at the same time, wanting to get the newest 
> content of the file. That happens in many environments, for example in some 
> message transmission applications. It also requires that the new content can 
> be read timely and efficiently, because there may be many tasks working at 
> the same time.






[jira] [Updated] (HDFS-12964) Read a opened file timely and effectively when it's being written by other

2017-12-26 Thread Yang Yun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-12964:

Description: One thread opens a HDFS file and keeps writing.  Another 
thread opens same file and keeps reading it at the same time, also want to get 
the newest content of file. that happens in many environments, for example, in 
some message transmission applications. And it also requires the new content 
can be read timely and effectively, for there maybe many tasks are working in 
same time.  (was: One thread opens a HDFS file and keep writing.  Another 
thread opens same file and keep reading it at the same time, also want to get 
the newest content of file. that happens at in many environments, for example, 
in some message transmission applications. And it also requires the new content 
can be read timely and effectively, for there maybe many tasks are working in 
same time.)

> Read a opened file timely and effectively when it's being written by other
> --
>
> Key: HDFS-12964
> URL: https://issues.apache.org/jira/browse/HDFS-12964
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Yang Yun
>Priority: Minor
> Attachments: HADOOP-12964.001.patch
>
>
> One thread opens an HDFS file and keeps writing. Another thread opens the 
> same file and keeps reading it at the same time, wanting to get the newest 
> content of the file. That happens in many environments, for example in some 
> message transmission applications. It also requires that the new content can 
> be read timely and efficiently, because there may be many tasks working at 
> the same time.






[jira] [Updated] (HDFS-12964) Read a opened file timely and effectively when it's being written by other

2017-12-26 Thread Yang Yun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-12964:

Priority: Minor  (was: Critical)

> Read a opened file timely and effectively when it's being written by other
> --
>
> Key: HDFS-12964
> URL: https://issues.apache.org/jira/browse/HDFS-12964
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Yang Yun
>Priority: Minor
> Attachments: HADOOP-12964.001.patch
>
>
> One thread opens an HDFS file and keeps writing. Another thread opens the 
> same file and keeps reading it at the same time, wanting to get the newest 
> content of the file. That happens in many environments, for example in some 
> message transmission applications. It also requires that the new content can 
> be read timely and efficiently, because there may be many tasks working at 
> the same time.






[jira] [Updated] (HDFS-12964) Read a opened file timely and effectively when it's being written by other

2017-12-26 Thread Yang Yun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-12964:

Description: One thread opens a HDFS file and keep writing.  Another thread 
opens same file and keep reading it at the same time, also want to get the 
newest content of file. that happens at in many environments, for example, in 
some message transmission applications. And it also requires the new content 
can be read timely and effectively, for there maybe many tasks are working in 
same time.  (was: One thread opens a HDFS file and keep writing.  Another 
thread opens same file and keep reading it at the same time, also want to get 
the newest content of file. that happens at in many environments, for example, 
talos message transmission. And it also requires the new content can be read 
timely and effectively, for there maybe many tasks are working in same time.)

> Read a opened file timely and effectively when it's being written by other
> --
>
> Key: HDFS-12964
> URL: https://issues.apache.org/jira/browse/HDFS-12964
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Yang Yun
>Priority: Critical
> Attachments: HADOOP-12964.001.patch
>
>
> One thread opens an HDFS file and keeps writing. Another thread opens the 
> same file and keeps reading it at the same time, wanting to get the newest 
> content of the file. That happens in many environments, for example in some 
> message transmission applications. It also requires that the new content can 
> be read timely and efficiently, because there may be many tasks working at 
> the same time.






[jira] [Updated] (HDFS-12964) Read a opened file timely and effectively when it's being written by other

2017-12-26 Thread Yang Yun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-12964:

Description: One thread opens a HDFS file and keep writing.  Another thread 
opens same file and keep reading it at the same time, also want to get the 
newest content of file. that happens at in many environments, for example, 
talos message transmission. And it also requires the new content can be read 
timely and effectively, for there maybe many tasks are working in same time.  
(was: One thread opens a HDFS file and keep writing.  Another thread opens same 
file and keep reading it at the same time, also want to get the newest content 
of file. that happen at in many environments, for example, talos message 
transmission. And it also require the new content can be read timely and 
effectively, for there maybe many tasks are working in same time.)

> Read a opened file timely and effectively when it's being written by other
> --
>
> Key: HDFS-12964
> URL: https://issues.apache.org/jira/browse/HDFS-12964
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Yang Yun
>Priority: Critical
> Attachments: HADOOP-12964.001.patch
>
>
> One thread opens an HDFS file and keeps writing. Another thread opens the 
> same file and keeps reading it at the same time, wanting to get the newest 
> content of the file. That happens in many environments, for example talos 
> message transmission. It also requires that the new content can be read 
> timely and efficiently, because there may be many tasks working at the same 
> time.






[jira] [Updated] (HDFS-12964) Read a opened file timely and effectively when it's being written by other

2017-12-26 Thread Yang Yun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-12964:

Attachment: HADOOP-12964.001.patch

Added a new DFSSpinInputStream extending DFSInputStream. If the file is being 
written by somebody else, DFSSpinInputStream keeps updating the file's length 
information when a read goes beyond the current EOF. It also adds a cache for 
the RefreshLocatedBlocks requests to the Namenode and for the connections to 
the Datanode.
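The behavior being described is essentially "tail -f" over a stream: remember the last read offset and return only the bytes appended since. A minimal plain java.io sketch of that pattern follows; it is a generic stand-in, not the proposed DFSSpinInputStream, which additionally caches the Namenode block-location lookups and the Datanode connection.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class TailReader {
    private long pos = 0;

    // Return the bytes appended to the file since the previous call
    // (possibly empty), advancing the remembered offset.
    byte[] readNew(String path) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(path, "r")) {
            long len = in.length();  // on HDFS this length probe is an RPC,
            if (len <= pos) {        // which is why caching it matters
                return new byte[0];
            }
            in.seek(pos);
            byte[] buf = new byte[(int) (len - pos)];
            in.readFully(buf);
            pos = len;
            return buf;
        }
    }
}
```

A caller that wants blocking semantics loops with a short sleep while readNew returns an empty array, which mirrors the hang-at-EOF behavior discussed in the comments.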

> Read a opened file timely and effectively when it's being written by other
> --
>
> Key: HDFS-12964
> URL: https://issues.apache.org/jira/browse/HDFS-12964
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Yang Yun
>Priority: Critical
> Attachments: HADOOP-12964.001.patch
>
>
> One thread opens an HDFS file and keeps writing. Another thread opens the 
> same file and keeps reading it at the same time, wanting to get the newest 
> content of the file. That happens in many environments, for example talos 
> message transmission. It also requires that the new content can be read 
> timely and efficiently, because there may be many tasks working at the same 
> time.






[jira] [Created] (HDFS-12964) Read a opened file timely and effectively when it's being written by other

2017-12-25 Thread Yang Yun (JIRA)
Yang Yun created HDFS-12964:
---

 Summary: Read a opened file timely and effectively when it's being 
written by other
 Key: HDFS-12964
 URL: https://issues.apache.org/jira/browse/HDFS-12964
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Yang Yun
Priority: Critical


One thread opens an HDFS file and keeps writing. Another thread opens the 
same file and keeps reading it at the same time, wanting to get the newest 
content of the file. That happens in many environments, for example talos 
message transmission. It also requires that the new content can be read 
timely and efficiently, because there may be many tasks working at the same 
time.





