[jira] [Commented] (HDFS-13117) Proposal to support writing replications to HDFS asynchronously

2018-02-10 Thread xuchuanyin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359764#comment-16359764
 ] 

xuchuanyin commented on HDFS-13117:
---

[~jojochuang] Actually I've tested it in a 3-node cluster. The time to copy a 
file from local disk to HDFS with 3 replicas is about *300ms*, while changing 
the HDFS file from 1 replica to 3 replicas costs about *10ms* or less. (Neither 
figure includes the time to write the local disk or the time to write the first 
replica to HDFS.)

 

Besides, skipping the write to local disk will *save about 33% of the disk 
write I/O*.
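For reference, a rough way to reproduce this kind of measurement from the command line (paths and file names below are only placeholders, and a running HDFS cluster is assumed):

```shell
# Write the file with a single replica first (the fast path: the client
# only waits for one DataNode to acknowledge each packet).
hdfs dfs -D dfs.replication=1 -put /tmp/local.dat /user/test/local.dat

# Then raise the replication factor; the NameNode schedules the extra
# replicas asynchronously. Add -w to block until they are complete.
hdfs dfs -setrep 3 /user/test/local.dat
```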

> Proposal to support writing replications to HDFS asynchronously
> ---
>
> Key: HDFS-13117
> URL: https://issues.apache.org/jira/browse/HDFS-13117
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: xuchuanyin
>Priority: Major
>
> My initial question was as below:
> ```
> I've learned that when we write data to HDFS using an interface provided by 
> HDFS such as 'FileSystem.create', our client will block until all the blocks 
> and their replicas are done. This causes an efficiency problem if we use 
> HDFS as our final data storage. Many of my colleagues write the data to 
> local disk in the main thread and copy it to HDFS in another thread. 
> Obviously, this increases the disk I/O.
>  
>    So, is there a way to optimize this usage? I don't want to increase the 
> disk I/O, nor do I want to be blocked while the extra replicas are being 
> written.
>   How about writing to HDFS with only one replica specified in the main 
> thread and setting the actual number of replicas in another thread? Or is 
> there a better way to do this?
> ```
>  
> So my proposal here is to support writing extra replicas to HDFS 
> asynchronously. The user can set a minimum replication factor as the 
> acceptable number of replicas (less than the default or expected replication 
> factor). When writing to HDFS, the user will only be blocked until the 
> minimum number of replicas has been finished, and HDFS will complete the 
> extra replicas in the background. Since HDFS periodically checks the 
> integrity of all replicas, we can also leave this work to HDFS itself.
>  
> There are two ways to provide the interfaces:
> 1. Creating a series of interfaces by adding `acceptableReplication` 
> parameter to the current interfaces as below:
> ```
> Before:
> FSDataOutputStream create(Path f,
>   boolean overwrite,
>   int bufferSize,
>   short replication,
>   long blockSize
> ) throws IOException
>  
> After:
> FSDataOutputStream create(Path f,
>   boolean overwrite,
>   int bufferSize,
>   short replication,
>   short acceptableReplication, // minimum number of replicas to finish 
> before returning
>   long blockSize
> ) throws IOException
> ```
>  
> 2. Adding the `acceptableReplication` and `asynchronous` to the runtime (or 
> default) configuration, so user will not have to change any interface and 
> will benefit from this feature.
>  
> What do you think about this?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13117) Proposal to support writing replications to HDFS asynchronously

2018-02-08 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356722#comment-16356722
 ] 

Wei-Chiu Chuang commented on HDFS-13117:


{quote}at least the time between the last block of the first replication and 
the last block of the last replication can be saved.
{quote}
Maybe. But the latency is less than 1 ms. Note that data are written in 
512-byte chunks, not in blocks. So you basically save almost nothing.

 

If you really want to test the performance of your approach, try creating a 
file with replication=1, and then use FileSystem.setReplication() to make it 
3-replica. The NameNode will then schedule the replication asynchronously. I 
don't think you'll notice much difference compared to writing a file with 
replication=3.
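A minimal sketch of that experiment (class name, path, and data size here are illustrative; it assumes a running HDFS cluster and the Hadoop client libraries on the classpath):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AsyncReplicationTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/tmp/replication-test.dat");

        // Write the file with a single replica: the client only waits
        // for one pipeline node to acknowledge each packet.
        long start = System.currentTimeMillis();
        try (FSDataOutputStream out = fs.create(path, true, 4096, (short) 1,
                fs.getDefaultBlockSize(path))) {
            out.write(new byte[8 * 1024 * 1024]);   // 8 MB of sample data
        }
        System.out.println("1-replica write: "
                + (System.currentTimeMillis() - start) + " ms");

        // Raise the replication factor; this call only updates metadata,
        // and the NameNode replicates the extra copies in the background.
        start = System.currentTimeMillis();
        fs.setReplication(path, (short) 3);
        System.out.println("setReplication(3): "
                + (System.currentTimeMillis() - start) + " ms");
    }
}
```

Comparing those two timings against a plain replication=3 write shows how much of the client-visible latency the extra replicas actually account for.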



[jira] [Commented] (HDFS-13117) Proposal to support writing replications to HDFS asynchronously

2018-02-07 Thread xuchuanyin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356592#comment-16356592
 ] 

xuchuanyin commented on HDFS-13117:
---

[~jojochuang] [~kihwal] Thanks for your response.

I haven't observed the consequences of directly writing files to HDFS in our 
apps. Actually we haven't tested it yet, since we already knew that when a 
client writes data to HDFS, it returns only after the last block of the last 
replica is done – so at least the time between the last block of the first 
replica and the last block of the last replica could be saved.

 

Our apps are high-performance in-memory computing processes (written in C). 
Their performance reading a local file is about 3~4X better than reading an 
HDFS file, so we worry that write performance will suffer from the same 
problem.

 

Now we want to strike a balance between efficiency and disk writes:

Writing temporary local files and copying them to HDFS in another thread 
certainly keeps the process fast, but causes more disk writes. Hence the 
proposal above.

 

I'm not sure if I have made my idea clear...



[jira] [Commented] (HDFS-13117) Proposal to support writing replications to HDFS asynchronously

2018-02-07 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355520#comment-16355520
 ] 

Kihwal Lee commented on HDFS-13117:
---

During writes, the data is written to multiple nodes concurrently. A client may 
experience slowness when one of the nodes is slow (e.g. transient I/O 
overload), but you can encounter such a node even if you write only one copy. 
Also, with a single replica, a single failure will fail the write permanently, 
which users normally cannot afford. HDFS was originally designed for batch 
processing, but is widely used in more demanding environments today. It could 
be a totally wrong choice for your app, or it could become usable with 
configuration changes. More analysis is needed to warrant a design change.

What is your application's write performance requirement? What are you seeing 
in your cluster? Which version of Hadoop are you running? Did you profile or 
jstack the datanodes or clients, by any chance? Does the app periodically 
sync/hsync/hflush the stream? Do streams tend to hang in the middle or at the 
end of a block? Do you see frequent pipeline breakages and recoveries along 
with the slowness? What is the I/O scheduler on the datanodes?




[jira] [Commented] (HDFS-13117) Proposal to support writing replications to HDFS asynchronously

2018-02-06 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354893#comment-16354893
 ] 

Wei-Chiu Chuang commented on HDFS-13117:


Hi [~xuchuanyin] thanks for filing the jira.

May I ask what client application you are describing here? Flume, for 
example, can append to an HDFS stream continuously and there's no "blocking" 
problem. HBase too.

From an HDFS client perspective, you can configure the number of in-flight 
packets during the write. If you configure it to allow only 1 in-flight 
packet, the connection appears to be "blocking". In addition, if you have 
HDFS transport encryption enabled but don't have hardware acceleration, the 
connection will also appear to be "blocking" or slow because of the 
encrypt/decrypt overhead.

 

There's also a configuration that allows the NameNode to close a file once it 
reaches a minimal number of replicas on the DataNodes. But I don't think 
that's what you are asking for here.
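For reference, the setting being alluded to is most likely `dfs.namenode.replication.min` in hdfs-site.xml (the minimal number of live replicas a block needs before the NameNode considers it complete and lets the file close); treat the exact key as an assumption and verify it against your Hadoop version's hdfs-default.xml:

```xml
<!-- hdfs-site.xml: allow a file to close once each block has at least
     one live replica; remaining replicas are created in the background. -->
<property>
  <name>dfs.namenode.replication.min</name>
  <value>1</value>
</property>
```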
