[
https://issues.apache.org/jira/browse/HDFS-13678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yiqun Lin updated HDFS-13678:
-----------------------------
Description:
In version 2.6.0, more storage types were added to HDFS (implemented in
HDFS-6584). But this seems to be an incompatible change when we rolling-upgrade
our cluster from 2.5.0 to 2.6.0, and the following error is thrown.
{noformat}
2018-06-14 11:43:39,246 ERROR [DataNode:
[[[DISK]file:/home/vipshop/hard_disk/dfs/, [DISK]file:/data1/dfs/,
[DISK]file:/data2/dfs/]] heartbeating to xx.xx.xx.xx:8022]
org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService
for Block pool BP-670256553-xx.xx.xx.xx-1528795419404 (Datanode Uuid
ab150e05-fcb7-49ed-b8ba-f05c27593fee) service to xx.xx.xx.xx:8022
java.lang.ArrayStoreException
at java.util.ArrayList.toArray(ArrayList.java:412)
at java.util.Collections$UnmodifiableCollection.toArray(Collections.java:1034)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1030)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:836)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:146)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:566)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:664)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:835)
at java.lang.Thread.run(Thread.java:748)
{noformat}
The scenario is that the old DataNode fails to parse the StorageType it gets from the new NameNode.
I think it would be better to return the default type instead of throwing an exception.
{code:java}
public static StorageType convertStorageType(StorageTypeProto type) {
  switch (type) {
  case DISK:
    return StorageType.DISK;
  case SSD:
    return StorageType.SSD;
  case ARCHIVE:
    return StorageType.ARCHIVE;
  case RAM_DISK:
    return StorageType.RAM_DISK;
  case PROVIDED:
    return StorageType.PROVIDED;
  default:
    throw new IllegalStateException(
        "BUG: StorageTypeProto not found, type=" + type);
  }
}
{code}
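A minimal sketch of that idea, assuming the fallback simply maps any unrecognized proto value to {{StorageType.DEFAULT}} (the exact handling is open for discussion, this is not the final patch):
{code:java}
public static StorageType convertStorageType(StorageTypeProto type) {
  switch (type) {
  case DISK:
    return StorageType.DISK;
  case SSD:
    return StorageType.SSD;
  case ARCHIVE:
    return StorageType.ARCHIVE;
  case RAM_DISK:
    return StorageType.RAM_DISK;
  case PROVIDED:
    return StorageType.PROVIDED;
  default:
    // A peer on a different software version may report a storage type this
    // version does not know about; fall back to the default type instead of
    // failing the whole RPC conversion.
    return StorageType.DEFAULT;
  }
}
{code}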
was:
In version 2.6.0, more storage types were added to HDFS (implemented in
HDFS-6584). But this seems to be an incompatible change when we rolling-upgrade
our cluster from 2.5.0 to 2.6.0, and the following error is thrown.
{noformat}
2018-06-14 11:43:39,246 ERROR [DataNode:
[[[DISK]file:/home/vipshop/hard_disk/dfs/, [DISK]file:/data1/dfs/,
[DISK]file:/data2/dfs/]] heartbeating to xx.xx.xx.xx:8022]
org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService
for Block pool BP-670256553-xx.xx.xx.xx-1528795419404 (Datanode Uuid
ab150e05-fcb7-49ed-b8ba-f05c27593fee) service to xx.xx.xx.xx:8022
java.lang.ArrayStoreException
at java.util.ArrayList.toArray(ArrayList.java:412)
at java.util.Collections$UnmodifiableCollection.toArray(Collections.java:1034)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1030)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:836)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:146)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:566)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:664)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:835)
at java.lang.Thread.run(Thread.java:748)
{noformat}
The scenario is that the new-version NameNode fails to parse the StorageType sent from the old-version DataNode. This is triggered by {{DNA_TRANSFER}} commands, that is to say, if there are under-replicated blocks, the error appears.
The convert logic is here:
{code:java}
public static BlockCommand convert(BlockCommandProto blkCmd) {
  List<BlockProto> blockProtoList = blkCmd.getBlocksList();
  Block[] blocks = new Block[blockProtoList.size()];
  ...
  StorageType[][] targetStorageTypes = new StorageType[targetList.size()][];
  List<StorageTypesProto> targetStorageTypesList =
      blkCmd.getTargetStorageTypesList();
  if (targetStorageTypesList.isEmpty()) { // missing storage types
    for (int i = 0; i < targetStorageTypes.length; i++) {
      targetStorageTypes[i] = new StorageType[targets[i].length];
      Arrays.fill(targetStorageTypes[i], StorageType.DEFAULT);
    }
  } else {
    for (int i = 0; i < targetStorageTypes.length; i++) {
      List<StorageTypeProto> p =
          targetStorageTypesList.get(i).getStorageTypesList();
      // <=== should do the try-catch here
      targetStorageTypes[i] = p.toArray(new StorageType[p.size()]);
    }
  }
{code}
An easy fix is to add a try-catch and fall back to the default storage type when parsing fails.
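A minimal sketch of that fallback, assuming the per-element conversion goes through a helper like the {{convertStorageType}} method quoted in the updated description; the loop structure and names here are illustrative, not the actual patch:
{code:java}
for (int i = 0; i < targetStorageTypes.length; i++) {
  List<StorageTypeProto> p =
      targetStorageTypesList.get(i).getStorageTypesList();
  targetStorageTypes[i] = new StorageType[p.size()];
  for (int j = 0; j < p.size(); j++) {
    try {
      targetStorageTypes[i][j] = convertStorageType(p.get(j));
    } catch (IllegalStateException e) {
      // Unknown storage type from a peer on a different software version;
      // use the default type instead of failing the whole block command.
      targetStorageTypes[i][j] = StorageType.DEFAULT;
    }
  }
}
{code}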
> StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions
> ---------------------------------------------------------------------
>
> Key: HDFS-13678
> URL: https://issues.apache.org/jira/browse/HDFS-13678
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rolling upgrades
> Affects Versions: 2.5.0
> Reporter: Yiqun Lin
> Priority: Major
>