[jira] [Commented] (HDFS-9266) hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 literals

2015-10-21 Thread Nemanja Matkovic (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967519#comment-14967519
 ] 

Nemanja Matkovic commented on HDFS-9266:


All these failures are obviously flaky tests, as they didn't fail with the same 
code in patch 1 (the only difference between the patches is disabling two test 
cases when running in IPv4-only mode).
I ran all of these locally again to confirm (except the StripedFile ones, which 
are broken at the current branch HEAD) and they all passed, so I think we are 
good to commit this into the branch.

> hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 
> literals
> -
>
> Key: HDFS-9266
> URL: https://issues.apache.org/jira/browse/HDFS-9266
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
> Attachments: HDFS-9266-HADOOP-11890.1.patch, 
> HDFS-9266-HADOOP-11890.2.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9266) hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 literals

2015-10-20 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-9266:
---
Status: Patch Available  (was: Open)

> hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 
> literals
> -
>
> Key: HDFS-9266
> URL: https://issues.apache.org/jira/browse/HDFS-9266
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
> Attachments: HDFS-9266-HADOOP-11890.1.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>






[jira] [Updated] (HDFS-9266) hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 literals

2015-10-20 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-9266:
---
Attachment: HDFS-9266-HADOOP-11890.1.patch

HDFS part of patch from HADOOP-12122.

> hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 
> literals
> -
>
> Key: HDFS-9266
> URL: https://issues.apache.org/jira/browse/HDFS-9266
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
> Attachments: HDFS-9266-HADOOP-11890.1.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>






[jira] [Commented] (HDFS-9266) hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 literals

2015-10-20 Thread Nemanja Matkovic (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966165#comment-14966165
 ] 

Nemanja Matkovic commented on HDFS-9266:


For the test case failures:
   - TestBlockManager.testBlocksAreNotUnderreplicatedInSingleRack --> the same 
test case fails the same way in Hdfs-trunk build #2448 ==> flaky test
   - TestNodeCount.testNodeCount --> the same test case fails the same way in 
Hdfs-trunk build #2448 ==> flaky test
   - TestRecoverStripedFile --> we are based after Erasure Coding was merged 
into trunk, and this test was already failing then, so these failures are not a 
regression from these changes
   - TestReplaceDatanodeOnFailure --> the same test case fails the same way in 
Hdfs-trunk build #2452 ==> flaky test
   - TestWriteReadStripedFile --> we are based after Erasure Coding was merged 
into trunk, and this test was already failing then, so these failures are not a 
regression from these changes
   - TestFileTruncate --> flaky test, tracked by HDFS-9224
   - TestRollingUpgrade --> the same test suite (not test case) fails the same 
way in Hdfs-trunk build #2454 ==> might be a flaky test; will see whether it 
passes with the next patch
   - TestNameNodeRespectsBindHostKeys --> this one is my bad; I forgot to test 
on an IPv4-only machine after adding test cases, and will upload a new patch soon


> hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 
> literals
> -
>
> Key: HDFS-9266
> URL: https://issues.apache.org/jira/browse/HDFS-9266
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
> Attachments: HDFS-9266-HADOOP-11890.1.patch, 
> HDFS-9266-HADOOP-11890.2.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>






[jira] [Updated] (HDFS-9266) hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 literals

2015-10-20 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-9266:
---
Attachment: HDFS-9266-HADOOP-11890.2.patch

Don't validate IPv6 binding when running on an IPv4-only box.

> hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 
> literals
> -
>
> Key: HDFS-9266
> URL: https://issues.apache.org/jira/browse/HDFS-9266
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
> Attachments: HDFS-9266-HADOOP-11890.1.patch, 
> HDFS-9266-HADOOP-11890.2.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>






[jira] [Commented] (HDFS-9266) hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 literals

2015-10-19 Thread Nemanja Matkovic (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963689#comment-14963689
 ] 

Nemanja Matkovic commented on HDFS-9266:


Cannot make this task a sub-task of HADOOP-11890, as this one is in HDFS and 
the previous one is in hadoop-common...

> hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 
> literals
> -
>
> Key: HDFS-9266
> URL: https://issues.apache.org/jira/browse/HDFS-9266
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>






[jira] [Created] (HDFS-9266) hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 literals

2015-10-19 Thread Nemanja Matkovic (JIRA)
Nemanja Matkovic created HDFS-9266:
--

 Summary: hadoop-hdfs - Avoid unsafe split and append on fields 
that might be IPv6 literals
 Key: HDFS-9266
 URL: https://issues.apache.org/jira/browse/HDFS-9266
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Nemanja Matkovic
Assignee: Nemanja Matkovic








[jira] [Commented] (HDFS-9266) hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 literals

2015-10-19 Thread Nemanja Matkovic (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963705#comment-14963705
 ] 

Nemanja Matkovic commented on HDFS-9266:


For context on why things are broken up the way they are, see this issue.

> hadoop-hdfs - Avoid unsafe split and append on fields that might be IPv6 
> literals
> -
>
> Key: HDFS-9266
> URL: https://issues.apache.org/jira/browse/HDFS-9266
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>






[jira] [Updated] (HDFS-9026) Support for include/exclude lists on IPv6 setup

2015-09-21 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-9026:
---
Attachment: HDFS-9026-HADOOP-11890.002.patch

Rename patch to match branch name.

> Support for include/exclude lists on IPv6 setup
> ---
>
> Key: HDFS-9026
> URL: https://issues.apache.org/jira/browse/HDFS-9026
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
> Environment: This affects only IPv6 cluster setup
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
> Attachments: HDFS-9026-1.patch, HDFS-9026-2.patch, 
> HDFS-9026-HADOOP-11890.002.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> This is a tracking item for having e2e IPv6 support in HDFS.
> Nate did great ground work in HDFS-8078, but for the whole feature to work 
> e2e, this is one of the missing items.
> Basically, today the NN won't be able to parse IPv6 addresses if they are 
> present in the include or exclude list.
> The patch depends on (and has been tested on an IPv6-only cluster on top of) 
> HDFS-8078.14.patch.
> This should be committed to the HADOOP-11890 branch.





[jira] [Updated] (HDFS-9026) Support for include/exclude lists on IPv6 setup

2015-09-17 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-9026:
---
Attachment: HDFS-9026-2.patch

Rebased on top of the HADOOP-11890 branch.

> Support for include/exclude lists on IPv6 setup
> ---
>
> Key: HDFS-9026
> URL: https://issues.apache.org/jira/browse/HDFS-9026
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
> Environment: This affects only IPv6 cluster setup
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
> Attachments: HDFS-9026-1.patch, HDFS-9026-2.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> This is a tracking item for having e2e IPv6 support in HDFS.
> Nate did great ground work in HDFS-8078, but for the whole feature to work 
> e2e, this is one of the missing items.
> Basically, today the NN won't be able to parse IPv6 addresses if they are 
> present in the include or exclude list.
> The patch depends on (and has been tested on an IPv6-only cluster on top of) 
> HDFS-8078.14.patch.
> This should be committed to the HADOOP-11890 branch.





[jira] [Updated] (HDFS-9026) Support for include/exclude lists on IPv6 setup

2015-09-04 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-9026:
---
Status: Patch Available  (was: Open)

> Support for include/exclude lists on IPv6 setup
> ---
>
> Key: HDFS-9026
> URL: https://issues.apache.org/jira/browse/HDFS-9026
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
> Environment: This affects only IPv6 cluster setup
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> This is a tracking item for having e2e IPv6 support in HDFS.
> Nate did great ground work in HDFS-8078, but for the whole feature to work 
> e2e, this is one of the missing items.
> Basically, today the NN won't be able to parse IPv6 addresses if they are 
> present in the include or exclude list.
> The patch depends on (and has been tested on an IPv6-only cluster on top of) 
> HDFS-8078.14.patch.
> This should be committed to the HADOOP-11890 branch.





[jira] [Created] (HDFS-9026) Support for include/exclude lists on IPv6 setup

2015-09-04 Thread Nemanja Matkovic (JIRA)
Nemanja Matkovic created HDFS-9026:
--

 Summary: Support for include/exclude lists on IPv6 setup
 Key: HDFS-9026
 URL: https://issues.apache.org/jira/browse/HDFS-9026
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
 Environment: This affects only IPv6 cluster setup
Reporter: Nemanja Matkovic
Assignee: Nemanja Matkovic


This is a tracking item for having e2e IPv6 support in HDFS.
Nate did great ground work in HDFS-8078, but for the whole feature to work 
e2e, this is one of the missing items.
Basically, today the NN won't be able to parse IPv6 addresses if they are 
present in the include or exclude list.
The patch depends on (and has been tested on an IPv6-only cluster on top of) 
HDFS-8078.14.patch.
This should be committed to the HADOOP-11890 branch.
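
The parsing gap described above can be illustrated with a minimal sketch. The 
class and method names here (HostsEntry, parse) are hypothetical, not the 
actual NameNode code: an include/exclude entry may be a bare hostname, a 
host:port pair, a bare IPv6 literal, or a bracketed [v6]:port form, and a 
naive split on the first colon misreads the literal.

```java
// Hypothetical sketch of IPv6-aware include/exclude entry parsing;
// names are illustrative, not the actual NameNode code.
class HostsEntry {
    // Returns {host, portOrNull} for entries such as:
    //   "dn1.example.com", "10.0.0.1:50010", "2401:db00::8:0", "[::1]:50010"
    static String[] parse(String entry) {
        if (entry.startsWith("[")) {
            // Bracketed IPv6 literal, optionally followed by ":port".
            int end = entry.indexOf(']');
            String host = entry.substring(1, end);
            String port = (end + 1 < entry.length() && entry.charAt(end + 1) == ':')
                    ? entry.substring(end + 2) : null;
            return new String[] { host, port };
        }
        int first = entry.indexOf(':');
        int last = entry.lastIndexOf(':');
        if (first < 0) {
            return new String[] { entry, null };            // bare host
        }
        if (first == last) {
            return new String[] { entry.substring(0, last), // host:port
                                  entry.substring(last + 1) };
        }
        return new String[] { entry, null };                // bare IPv6 literal
    }
}
```

The key point is the multi-colon case: an unbracketed entry with more than one 
colon can only be an IPv6 literal, so no port is split off.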





[jira] [Updated] (HDFS-9026) Support for include/exclude lists on IPv6 setup

2015-09-04 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-9026:
---
Attachment: HDFS-9026-1.patch

Patch for this issue, stacked on top of the above-mentioned HDFS-8078.

> Support for include/exclude lists on IPv6 setup
> ---
>
> Key: HDFS-9026
> URL: https://issues.apache.org/jira/browse/HDFS-9026
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
> Environment: This affects only IPv6 cluster setup
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
> Attachments: HDFS-9026-1.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> This is a tracking item for having e2e IPv6 support in HDFS.
> Nate did great ground work in HDFS-8078, but for the whole feature to work 
> e2e, this is one of the missing items.
> Basically, today the NN won't be able to parse IPv6 addresses if they are 
> present in the include or exclude list.
> The patch depends on (and has been tested on an IPv6-only cluster on top of) 
> HDFS-8078.14.patch.
> This should be committed to the HADOOP-11890 branch.





[jira] [Commented] (HDFS-9026) Support for include/exclude lists on IPv6 setup

2015-09-04 Thread Nemanja Matkovic (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14731147#comment-14731147
 ] 

Nemanja Matkovic commented on HDFS-9026:


Tagging [~nkedel] and [~eclark], as we're working together on the uber jira.

> Support for include/exclude lists on IPv6 setup
> ---
>
> Key: HDFS-9026
> URL: https://issues.apache.org/jira/browse/HDFS-9026
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
> Environment: This affects only IPv6 cluster setup
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>  Labels: ipv6
> Attachments: HDFS-9026-1.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> This is a tracking item for having e2e IPv6 support in HDFS.
> Nate did great ground work in HDFS-8078, but for the whole feature to work 
> e2e, this is one of the missing items.
> Basically, today the NN won't be able to parse IPv6 addresses if they are 
> present in the include or exclude list.
> The patch depends on (and has been tested on an IPv6-only cluster on top of) 
> HDFS-8078.14.patch.
> This should be committed to the HADOOP-11890 branch.





[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode

2015-08-13 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-8078:
---
Description: 
1st exception, on put:

15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
java.lang.IllegalArgumentException: Does not contain a valid host:port 
authority: 2401:db00:1010:70ba:face:0:8:0:50010
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
at 
org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)

Appears to actually stem from code in DataNodeID which assumes it's safe to 
append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for 
IPv6.  NetUtils.createSocketAddr( ) assembles a Java URI object, which requires 
the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010
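
The concatenation problem can be sketched as follows; this is a minimal 
illustration under assumed helper names (AddrJoin, naiveJoin, safeJoin), not 
the actual DataNodeID code:

```java
// Hypothetical sketch: naive string concatenation of address and port is
// ambiguous for IPv6, while bracketing the literal matches the URI
// authority syntax that NetUtils.createSocketAddr()/java.net.URI expect.
class AddrJoin {
    // Naive join: fine for IPv4, ambiguous for IPv6 literals.
    static String naiveJoin(String ip, int port) {
        return ip + ":" + port;
    }

    // IPv6-safe join: wrap any literal containing ':' in brackets.
    static String safeJoin(String ip, int port) {
        if (ip.contains(":") && !ip.startsWith("[")) {
            return "[" + ip + "]:" + port;
        }
        return ip + ":" + port;
    }
}
```

With this, safeJoin("2401:db00:1010:70ba:face:0:8:0", 50010) produces the 
bracketed form that createSocketAddr() can parse.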

Currently using InetAddress.getByName() to validate IPv6 (guava 
InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra 
object creation shouldn't be problematic, and for me the slight risk of passing 
in bad input that is not actually an IPv4 or IPv6 address and thus calling an 
external DNS lookup is outweighed by getting the address normalized and 
avoiding rewriting parsing.)

Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()
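
The validation trade-off above can be sketched like this (a minimal 
illustration with a hypothetical helper name, not the patch itself): 
InetAddress.getByName() both validates and normalizes an address literal 
in-process, but a non-literal input would fall back to a DNS lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Minimal illustration: InetAddress.getByName() parses an address literal
// without touching DNS and returns a normalized form; a non-literal
// hostname would trigger a DNS lookup -- the risk weighed above.
class AddrNormalize {
    static String normalize(String ip) {
        try {
            return InetAddress.getByName(ip).getHostAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + ip, e);
        }
    }
}
```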

---

2nd exception (on datanode)
15/04/13 13:18:07 ERROR datanode.DataNode: 
dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation  
src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
/2401:db00:11:d010:face:0:2f:0:50010
java.io.EOFException
at java.io.DataInputStream.readShort(DataInputStream.java:315)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
at java.lang.Thread.run(Thread.java:745)

This also surfaces as a client-side error: "-get: 2401 is not an IP string literal".

This one has existing parsing logic which needs to shift to the last colon 
rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
rather than split.  Could alternatively use the techniques above.
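
The last-colon approach can be sketched as follows (a hypothetical helper, 
not the committed code):

```java
// Hypothetical sketch: split "host:port" on the LAST colon so IPv6
// literals keep their colons in the host part. lastIndexOf also avoids
// the allocation cost of String.split.
class HostPortSplit {
    static String[] splitHostPort(String addr) {
        int i = addr.lastIndexOf(':');
        if (i < 0) {
            throw new IllegalArgumentException("No port in: " + addr);
        }
        String host = addr.substring(0, i);
        // Strip URI-style brackets if present, e.g. "[::1]:8020".
        if (host.startsWith("[") && host.endsWith("]")) {
            host = host.substring(1, host.length() - 1);
        }
        return new String[] { host, addr.substring(i + 1) };
    }
}
```

For example, splitHostPort("2401:db00:11:d010:face:0:2f:0:50010") yields the 
full IPv6 literal as the host and "50010" as the port, where a first-colon 
split would return "2401" as the host.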

  was:
/patch1st exception, on put:

15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
java.lang.IllegalArgumentException: Does not contain a valid host:port 
authority: 2401:db00:1010:70ba:face:0:8:0:50010
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
at 
org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)

Appears to actually stem from code in DataNodeID which assumes it's safe to 
append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for 
IPv6.  NetUtils.createSocketAddr( ) assembles a Java URI object, which requires 
the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010

Currently using InetAddress.getByName() to validate IPv6 (guava 
InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra 
object creation shouldn't be problematic, and for me the slight risk of passing 
in bad input that is not actually an IPv4 or IPv6 address and thus calling an 
external DNS lookup is outweighed by getting the address normalized and 
avoiding rewriting parsing.)

Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()

---

2nd exception (on datanode)
15/04/13 13:18:07 ERROR datanode.DataNode: 
dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation  
src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
/2401:db00:11:d010:face:0:2f:0:50010
java.io.EOFException
at java.io.DataInputStream.readShort(DataInputStream.java:315)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
at 

[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode

2015-08-13 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-8078:
---
Description: 
/patch1st exception, on put:

15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
java.lang.IllegalArgumentException: Does not contain a valid host:port 
authority: 2401:db00:1010:70ba:face:0:8:0:50010
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
at 
org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)

Appears to actually stem from code in DataNodeID which assumes it's safe to 
append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for 
IPv6.  NetUtils.createSocketAddr( ) assembles a Java URI object, which requires 
the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010

Currently using InetAddress.getByName() to validate IPv6 (guava 
InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra 
object creation shouldn't be problematic, and for me the slight risk of passing 
in bad input that is not actually an IPv4 or IPv6 address and thus calling an 
external DNS lookup is outweighed by getting the address normalized and 
avoiding rewriting parsing.)

Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()

---

2nd exception (on datanode)
15/04/13 13:18:07 ERROR datanode.DataNode: 
dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation  
src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
/2401:db00:11:d010:face:0:2f:0:50010
java.io.EOFException
at java.io.DataInputStream.readShort(DataInputStream.java:315)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
at java.lang.Thread.run(Thread.java:745)

This also surfaces as a client-side error: "-get: 2401 is not an IP string literal".

This one has existing parsing logic which needs to shift to the last colon 
rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
rather than split.  Could alternatively use the techniques above.

  was:
1st exception, on put:

15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
java.lang.IllegalArgumentException: Does not contain a valid host:port 
authority: 2401:db00:1010:70ba:face:0:8:0:50010
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
at 
org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)

Appears to actually stem from code in DataNodeID which assumes it's safe to 
append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for 
IPv6.  NetUtils.createSocketAddr( ) assembles a Java URI object, which requires 
the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010

Currently using InetAddress.getByName() to validate IPv6 (guava 
InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra 
object creation shouldn't be problematic, and for me the slight risk of passing 
in bad input that is not actually an IPv4 or IPv6 address and thus calling an 
external DNS lookup is outweighed by getting the address normalized and 
avoiding rewriting parsing.)

Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()

---

2nd exception (on datanode)
15/04/13 13:18:07 ERROR datanode.DataNode: 
dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation  
src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
/2401:db00:11:d010:face:0:2f:0:50010
java.io.EOFException
at java.io.DataInputStream.readShort(DataInputStream.java:315)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
at