[ https://issues.apache.org/jira/browse/HDFS-17238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ECFuzz updated HDFS-17238:
--------------------------
    Description: 
My Hadoop version is 3.3.6, and I run it in pseudo-distributed operation.

core-site.xml is as follows.
{code:xml}
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/Mutil_Component/tmp</value>
    </property>
</configuration>{code}
hdfs-site.xml is as follows.
{code:xml}
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>1342177280000</value>
    </property>
</configuration>{code}
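For comparison, the shipped default for dfs.blocksize is 134217728 (128 MB); the value above is exactly that default with four zeros appended (134217728 * 10000 = 1342177280000, roughly 1.22 TiB per block). A conventional setting would look like this (illustrative only; size suffixes such as 128m are also accepted):
{code:xml}
<property>
    <name>dfs.blocksize</name>
    <!-- 128 MB, the default block size -->
    <value>134217728</value>
</property>{code}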
Then format the NameNode and start HDFS; HDFS starts normally.
{code:bash}
hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ bin/hdfs namenode -format
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx(many info)
hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ sbin/start-dfs.sh
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [hadoop-Standard-PC-i440FX-PIIX-1996] {code}
Finally, use dfs to put the configuration files.
{code:bash}
bin/hdfs dfs -mkdir -p /user/hadoop
bin/hdfs dfs -mkdir input
bin/hdfs dfs -put etc/hadoop/*.xml input {code}
The put then fails with the following exception.
{code:java}
2023-10-19 14:56:34,603 WARN hdfs.DataStreamer: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hadoop/input/capacity-scheduler.xml._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2350)
        at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2989)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:912)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:595)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1094)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1017)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3048)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1567)
        at org.apache.hadoop.ipc.Client.call(Client.java:1513)
        at org.apache.hadoop.ipc.Client.call(Client.java:1410)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:531)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
        at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1088)
        at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1915)
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1717)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:713)
put: File /user/hadoop/input/capacity-scheduler.xml._COPYING_ could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation. {code}
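One way to confirm the mismatch is to compare the DataNode's remaining space against the configured block size; a diagnostic sketch (the exact report fields vary by Hadoop version):
{code:bash}
# Lists per-DataNode "Configured Capacity" and "DFS Remaining"; with
# dfs.blocksize at ~1.22 TiB, remaining space is far below one block.
bin/hdfs dfsadmin -report{code}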
Analysis:

The error message means that the NameNode could not find even the single DataNode required by minReplication as a target for the new block. With "dfs.blocksize" set to 1342177280000 bytes (about 1.22 TiB), the default block placement policy in effect requires each candidate DataNode to have at least one full block of free space; the lone DataNode cannot satisfy that, so it is excluded and the write fails.
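A minimal sketch of that eligibility check (illustrative names only, not Hadoop's actual code):
{code:java}
// Sketch of the "room for one full block?" test that ends up excluding
// every DataNode when dfs.blocksize is enormous. Names are hypothetical.
public class BlockSizeCheckSketch {

    // A candidate DataNode must be able to hold at least one full block.
    static boolean isEligibleTarget(long remainingBytes, long blockSizeBytes) {
        return remainingBytes >= blockSizeBytes;
    }

    public static void main(String[] args) {
        long blockSize = 1342177280000L;            // value from hdfs-site.xml, ~1.22 TiB
        long remaining = 500L * 1024 * 1024 * 1024; // suppose 500 GiB free on the DataNode
        // Prints false: the only DataNode is excluded, hence
        // "could only be written to 0 of the 1 minReplication nodes".
        System.out.println(isEligibleTarget(remaining, blockSize));
    }
}{code}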

 

 


> Setting the value of "dfs.blocksize" too large will cause HDFS to be unable 
> to write files
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-17238
>                 URL: https://issues.apache.org/jira/browse/HDFS-17238
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.3.6
>            Reporter: ECFuzz
>            Priority: Major
>

