[jira] [Commented] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2018-03-01 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383063#comment-16383063
 ] 

Huafeng Wang commented on HDFS-12257:
-

Hi [~Sammi], I'm not sure about that. The patch hasn't been reviewed, and it 
now appears to conflict with trunk, so it has to be revised. I can try to 
update the patch, but I'm afraid it will take a few days.

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>Priority: Major
> Attachments: HDFS-12257.001.patch, HDFS-12257.002.patch, 
> HDFS-12257.003.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-11-03 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237274#comment-16237274
 ] 

Huafeng Wang commented on HDFS-11467:
-

The failed tests are unrelated to this patch, and they all passed locally.

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch, HDFS-11467.002.patch, 
> HDFS-11467.003.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-11-03 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237146#comment-16237146
 ] 

Huafeng Wang commented on HDFS-11467:
-

Hi [~xiaochen], I just uploaded a new patch against the latest trunk. Please 
help review it.

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch, HDFS-11467.002.patch, 
> HDFS-11467.003.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Updated] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-11-03 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-11467:

Attachment: HDFS-11467.003.patch

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch, HDFS-11467.002.patch, 
> HDFS-11467.003.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-11-02 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235339#comment-16235339
 ] 

Huafeng Wang commented on HDFS-11467:
-

Hi Andrew, I'm working on it and I'll post an updated patch ASAP.

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch, HDFS-11467.002.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-23 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216339#comment-16216339
 ] 

Huafeng Wang commented on HDFS-11467:
-

Thanks [~xiaochen] for the clarification. I'll update my patch once HDFS-12682 
is merged.

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch, HDFS-11467.002.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-23 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216250#comment-16216250
 ] 

Huafeng Wang commented on HDFS-11467:
-

Hi [~xiaochen], thanks for your review.
Sorry, I don't fully understand your second point. AFAIK the fsimage is a 
snapshot of the namespace, so we only need one disabled, one enabled, and one 
removed EC policy to test the section serialization/deserialization. The 
different combinations of state transitions make no difference here. Please 
correct me if I am wrong.
Also, I think this issue is now blocked by HDFS-12682, since the serialized EC 
policy will not have the right state, so the test cannot pass.
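
To illustrate the point about section serialization: a round-trip check needs only one policy per state, because a snapshot records the state each policy currently has, not how it got there. The sketch below is a hypothetical, self-contained model in plain Java. `PolicyState`, `Policy`, `toXml`, and `fromXml` are illustrative names, and the XML layout is made up; none of it is the OIV tool's real classes or schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model; none of these names come from the Hadoop code base.
enum PolicyState { DISABLED, ENABLED, REMOVED }

final class Policy {
  final String name;
  final PolicyState state;

  Policy(String name, PolicyState state) {
    this.name = name;
    this.state = state;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof Policy
        && ((Policy) o).name.equals(name)
        && ((Policy) o).state == state;
  }

  @Override
  public int hashCode() {
    return 31 * name.hashCode() + state.hashCode();
  }
}

final class EcSectionRoundTrip {
  private static final Pattern ENTRY =
      Pattern.compile("<policy name=\"([^\"]*)\" state=\"([A-Z]+)\"/>");

  // Writes the section in a made-up XML layout (not the real OIV schema).
  static String toXml(List<Policy> policies) {
    StringBuilder sb = new StringBuilder("<ErasureCodingSection>");
    for (Policy p : policies) {
      sb.append("<policy name=\"").append(p.name)
        .append("\" state=\"").append(p.state).append("\"/>");
    }
    return sb.append("</ErasureCodingSection>").toString();
  }

  // Reads the section back; only the final state of each policy is visible.
  static List<Policy> fromXml(String xml) {
    List<Policy> out = new ArrayList<>();
    Matcher m = ENTRY.matcher(xml);
    while (m.find()) {
      out.add(new Policy(m.group(1), PolicyState.valueOf(m.group(2))));
    }
    return out;
  }

  // One policy per state is enough to exercise every serialized value.
  static List<Policy> samplePolicies() {
    List<Policy> list = new ArrayList<>();
    list.add(new Policy("policy-a", PolicyState.DISABLED));
    list.add(new Policy("policy-b", PolicyState.ENABLED));
    list.add(new Policy("policy-c", PolicyState.REMOVED));
    return list;
  }
}
```

A test in this style would assert that `fromXml(toXml(samplePolicies()))` equals the input list. If the serialized state were wrong, as described for HDFS-12682, that equality check is exactly what would fail.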


> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch, HDFS-11467.002.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-16 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16205654#comment-16205654
 ] 

Huafeng Wang commented on HDFS-11467:
-

Thanks [~jojochuang] and [~Sammi] for your reviews. I just updated my patch 
according to your comments.

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch, HDFS-11467.002.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Updated] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-16 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-11467:

Attachment: HDFS-11467.002.patch

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch, HDFS-11467.002.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Updated] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-12 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-11467:

Attachment: HDFS-11467.001.patch

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Updated] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-12 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-11467:

Status: Patch Available  (was: Open)

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-12 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202993#comment-16202993
 ] 

Huafeng Wang commented on HDFS-11467:
-

As discussed with Wei offline, I'll take this one.

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei Zhou
>  Labels: hdfs-ec-3.0-must-do
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Assigned] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-12 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-11467:
---

Assignee: Huafeng Wang  (was: Wei Zhou)

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.






[jira] [Commented] (HDFS-12635) Unnecessary exception declaration of the CellBuffers constructor

2017-10-11 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199943#comment-16199943
 ] 

Huafeng Wang commented on HDFS-12635:
-

Thanks Kai for your review!

> Unnecessary exception declaration of the CellBuffers constructor
> 
>
> Key: HDFS-12635
> URL: https://issues.apache.org/jira/browse/HDFS-12635
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-12635.001.patch
>
>







[jira] [Resolved] (HDFS-12633) Unnecessary exception declaration of the CellBuffers constructor

2017-10-10 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang resolved HDFS-12633.
-
Resolution: Duplicate

> Unnecessary exception declaration of the CellBuffers constructor
> 
>
> Key: HDFS-12633
> URL: https://issues.apache.org/jira/browse/HDFS-12633
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
>







[jira] [Updated] (HDFS-12635) Unnecessary exception declaration of the CellBuffers constructor

2017-10-10 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12635:

Status: Patch Available  (was: Open)

> Unnecessary exception declaration of the CellBuffers constructor
> 
>
> Key: HDFS-12635
> URL: https://issues.apache.org/jira/browse/HDFS-12635
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Attachments: HDFS-12635.001.patch
>
>







[jira] [Updated] (HDFS-12635) Unnecessary exception declaration of the CellBuffers constructor

2017-10-10 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12635:

Attachment: HDFS-12635.001.patch

> Unnecessary exception declaration of the CellBuffers constructor
> 
>
> Key: HDFS-12635
> URL: https://issues.apache.org/jira/browse/HDFS-12635
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Attachments: HDFS-12635.001.patch
>
>







[jira] [Created] (HDFS-12635) Unnecessary exception declaration of the CellBuffers constructor

2017-10-10 Thread Huafeng Wang (JIRA)
Huafeng Wang created HDFS-12635:
---

 Summary: Unnecessary exception declaration of the CellBuffers 
constructor
 Key: HDFS-12635
 URL: https://issues.apache.org/jira/browse/HDFS-12635
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Huafeng Wang
Assignee: Huafeng Wang
Priority: Minor









[jira] [Created] (HDFS-12633) Unnecessary exception declaration of the CellBuffers constructor

2017-10-10 Thread Huafeng Wang (JIRA)
Huafeng Wang created HDFS-12633:
---

 Summary: Unnecessary exception declaration of the CellBuffers 
constructor
 Key: HDFS-12633
 URL: https://issues.apache.org/jira/browse/HDFS-12633
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Huafeng Wang
Assignee: Huafeng Wang
Priority: Minor









[jira] [Updated] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-10-10 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12497:

Attachment: HDFS-12497.004.patch

> Re-enable TestDFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-12497
> URL: https://issues.apache.org/jira/browse/HDFS-12497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: flaky-test, hdfs-ec-3.0-must-do
> Attachments: HDFS-12497.001.patch, HDFS-12497.002.patch, 
> HDFS-12497.003.patch, HDFS-12497.004.patch
>
>
> We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
> should fix these tests and re-enable them.






[jira] [Commented] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-10-10 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198299#comment-16198299
 ] 

Huafeng Wang commented on HDFS-12257:
-

Hi [~asuresh], by the next major release, do you mean the 3.0 release? I'm OK 
with porting to 2.X versions; since the patch basically just adds a new API, 
it won't be much trouble.

[~andrew.wang], any comments on this one? I'll fix the checkstyle issues along 
with any changes requested in later comments.

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12257.001.patch, HDFS-12257.002.patch, 
> HDFS-12257.003.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.






[jira] [Commented] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-10-09 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198069#comment-16198069
 ] 

Huafeng Wang commented on HDFS-12497:
-

Hi Andrew, 
{quote}
The exception being thrown in DFSStripedOutputStream
{quote}
The constructor of the inner class {{CellBuffers}} in 
{{DFSStripedOutputStream}} declares that it throws {{InterruptedException}}, 
but the actual code never throws that exception, so I removed the declaration.
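
The cost of such a declaration can be shown with a small, self-contained Java sketch. `BuffersWithThrows`, `BuffersWithoutThrows`, and `ThrowsDemo` are hypothetical stand-ins, not the real {{CellBuffers}} class: a constructor that declares a checked exception it never throws still forces every caller to handle it.

```java
// Hypothetical stand-ins for illustration; not the real CellBuffers class.
final class BuffersWithThrows {
  final byte[][] buffers;

  // Declares InterruptedException although nothing in the body can throw it.
  BuffersWithThrows(int count, int size) throws InterruptedException {
    buffers = new byte[count][size];
  }
}

final class BuffersWithoutThrows {
  final byte[][] buffers;

  // Same body, no throws clause: callers need no try/catch.
  BuffersWithoutThrows(int count, int size) {
    buffers = new byte[count][size];
  }
}

final class ThrowsDemo {
  // The caller must handle an exception that can never occur.
  static int allocateWithThrows() {
    try {
      return new BuffersWithThrows(3, 8).buffers.length;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  // Without the declaration the call is a single expression.
  static int allocateWithoutThrows() {
    return new BuffersWithoutThrows(3, 8).buffers.length;
  }
}
```

Both methods allocate the same buffers; the only difference is the dead catch block that the unnecessary throws clause imposes on the caller.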

{quote}
Removing logging from TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
{quote}
The constructor is not needed anymore, so I also removed the logging part. I 
can add it back if this log is necessary.

> Re-enable TestDFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-12497
> URL: https://issues.apache.org/jira/browse/HDFS-12497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: flaky-test, hdfs-ec-3.0-must-do
> Attachments: HDFS-12497.001.patch, HDFS-12497.002.patch, 
> HDFS-12497.003.patch
>
>
> We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
> should fix these tests and re-enable them.






[jira] [Commented] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-10-08 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196428#comment-16196428
 ] 

Huafeng Wang commented on HDFS-12497:
-

Hi [~andrew.wang], any comment on this one?

> Re-enable TestDFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-12497
> URL: https://issues.apache.org/jira/browse/HDFS-12497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: flaky-test, hdfs-ec-3.0-must-do
> Attachments: HDFS-12497.001.patch, HDFS-12497.002.patch, 
> HDFS-12497.003.patch
>
>
> We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
> should fix these tests and re-enable them.






[jira] [Updated] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-09-29 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12497:

Attachment: HDFS-12497.003.patch

> Re-enable TestDFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-12497
> URL: https://issues.apache.org/jira/browse/HDFS-12497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: flaky-test, hdfs-ec-3.0-must-do
> Attachments: HDFS-12497.001.patch, HDFS-12497.002.patch, 
> HDFS-12497.003.patch
>
>
> We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
> should fix these tests and re-enable them.






[jira] [Commented] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-09-29 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16185412#comment-16185412
 ] 

Huafeng Wang commented on HDFS-12497:
-

Hi [~Sammi] [~andrew.wang], I found that decreasing the number of stripes per 
block cannot fully solve the timeout issue, and it also affects the original 
test cases. The original {{TestDFSStripedOutputStreamWithFailure}} generates a 
list of 216 (3 * 4 * 6 * 3) different file lengths, and each subclass chooses a 
different file length to run the test. Setting the number of stripes per block 
to 2 would make some test cases invalid.

So I propose decreasing the cell size in these test cases instead. I'll give it 
a try and see whether the timeout issue can be eliminated.
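
The figure of 216 is simply the product of the four factors quoted above. A minimal sketch, assuming the lengths come from four independent nested loops: the comment gives only the factor sizes, so the loop variables and the placeholder value stored per combination are hypothetical and not taken from the test class.

```java
import java.util.ArrayList;
import java.util.List;

final class LengthCombinations {
  // Factor sizes quoted in the comment: 3 * 4 * 6 * 3.
  static final int[] FACTORS = {3, 4, 6, 3};

  // Enumerates one entry per combination of the four loop indices.
  // The stored value is just a running index, not a real file length.
  static List<Integer> enumerate() {
    List<Integer> combos = new ArrayList<>();
    for (int a = 0; a < FACTORS[0]; a++) {
      for (int b = 0; b < FACTORS[1]; b++) {
        for (int c = 0; c < FACTORS[2]; c++) {
          for (int d = 0; d < FACTORS[3]; d++) {
            combos.add(combos.size());
          }
        }
      }
    }
    return combos;
  }
}
```

Shrinking any one factor removes whole slices of this list, which is one way to see why a change that reduces a factor can leave some of the generated cases without a valid input.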

> Re-enable TestDFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-12497
> URL: https://issues.apache.org/jira/browse/HDFS-12497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: flaky-test, hdfs-ec-3.0-must-do
> Attachments: HDFS-12497.001.patch, HDFS-12497.002.patch
>
>
> We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
> should fix these tests and re-enable them.






[jira] [Commented] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-09-28 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16185286#comment-16185286
 ] 

Huafeng Wang commented on HDFS-12257:
-

Hi Andrew, I just added a new API as you proposed. Please help review the new 
patch.

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12257.001.patch, HDFS-12257.002.patch, 
> HDFS-12257.003.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.






[jira] [Updated] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-09-28 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12257:

Attachment: HDFS-12257.003.patch

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12257.001.patch, HDFS-12257.002.patch, 
> HDFS-12257.003.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.






[jira] [Updated] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-09-28 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12497:

Attachment: HDFS-12497.002.patch

> Re-enable TestDFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-12497
> URL: https://issues.apache.org/jira/browse/HDFS-12497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: flaky-test, hdfs-ec-3.0-must-do
> Attachments: HDFS-12497.001.patch, HDFS-12497.002.patch
>
>
> We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
> should fix these tests and re-enable them.






[jira] [Updated] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-09-27 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12497:

Status: Patch Available  (was: Open)

> Re-enable TestDFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-12497
> URL: https://issues.apache.org/jira/browse/HDFS-12497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: flaky-test, hdfs-ec-3.0-must-do
> Attachments: HDFS-12497.001.patch
>
>
> We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
> should fix these tests and re-enable them.






[jira] [Assigned] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-09-27 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-12497:
---

Assignee: Huafeng Wang  (was: SammiChen)

> Re-enable TestDFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-12497
> URL: https://issues.apache.org/jira/browse/HDFS-12497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: flaky-test, hdfs-ec-3.0-must-do
> Attachments: HDFS-12497.001.patch
>
>
> We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
> should fix these tests and re-enable them.






[jira] [Updated] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-09-27 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12497:

Attachment: HDFS-12497.001.patch

> Re-enable TestDFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-12497
> URL: https://issues.apache.org/jira/browse/HDFS-12497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: flaky-test, hdfs-ec-3.0-must-do
> Attachments: HDFS-12497.001.patch
>
>
> We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
> should fix these tests and re-enable them.






[jira] [Commented] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-09-26 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181865#comment-16181865
 ] 

Huafeng Wang commented on HDFS-12257:
-

Hi Andrew, right now DistributedFileSystem and DFSClient both have the 
API that returns an array. I agree with you, and if we proceed with your idea, 
should we deprecate these old APIs?

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12257.001.patch, HDFS-12257.002.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.






[jira] [Commented] (HDFS-12534) Provide logical BlockLocations for EC files for better split calculation

2017-09-24 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178479#comment-16178479
 ] 

Huafeng Wang commented on HDFS-12534:
-

Hi [~andrew.wang], I have a question here. 
{quote}
Applications depend on HDFS BlockLocation to understand where the split points 
are.
{quote}
I think the logical BlockLocation currently returned per block group already has 
the locations of all the data blocks and parity blocks. Isn't this information enough? 
What's the difference between splitting a single block group and multiple 
logical block locations here? 


> Provide logical BlockLocations for EC files for better split calculation
> 
>
> Key: HDFS-12534
> URL: https://issues.apache.org/jira/browse/HDFS-12534
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>  Labels: hdfs-ec-3.0-must-do
>
> I talked to [~vanzin] and [~alex.behm] some more about split calculation with 
> EC. It turns out HDFS-12222 was resolved prematurely. Applications depend on 
> HDFS BlockLocation to understand where the split points are. The current 
> scheme of returning one BlockLocation per block group loses this information.
> We should change this to provide logical blocks. Divide the file length by 
> the block size and provide suitable BlockLocations to match, with virtual 
> offsets and lengths too.
> I'm not marking this as incompatible, since changing it this way would in 
> fact make it more compatible from the perspective of applications that are 
> scheduling against replicated files. Thus, it'd be good for beta1 if 
> possible, but okay for later too.
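
The "divide the file length by the block size" idea in the description above can be sketched as follows. This is only an illustration under stated assumptions: `LogicalBlockSketch`, `Range`, and `logicalBlocks` are hypothetical names, not the HDFS `BlockLocation` API, and host information is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the HDFS BlockLocation class.
final class LogicalBlockSketch {

    /** A virtual block: its byte offset within the file and its length. */
    static final class Range {
        final long offset;
        final long length;

        Range(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Splits a file of fileLength bytes into ranges of at most blockSize bytes. */
    static List<Range> logicalBlocks(long fileLength, long blockSize) {
        if (fileLength < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and blockSize > 0");
        }
        List<Range> ranges = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            // The last range may be shorter than blockSize.
            ranges.add(new Range(offset, Math.min(blockSize, fileLength - offset)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 300 MiB file with a 128 MiB block size yields three logical blocks.
        for (Range r : logicalBlocks(300L << 20, 128L << 20)) {
            System.out.println("offset=" + r.offset + " length=" + r.length);
        }
    }
}
```

An application that computes splits from offsets and lengths would then see one entry per block-size range rather than one per block group.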






[jira] [Commented] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-09-24 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178467#comment-16178467
 ] 

Huafeng Wang commented on HDFS-12257:
-

Hi [~andrew.wang], can you help to take a look at this one?

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12257.001.patch, HDFS-12257.002.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.






[jira] [Commented] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-09-21 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175941#comment-16175941
 ] 

Huafeng Wang commented on HDFS-12257:
-

Hi [~msingh], thanks for your review and I just updated the patch.

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12257.001.patch, HDFS-12257.002.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.






[jira] [Updated] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-09-21 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12257:

Attachment: HDFS-12257.002.patch

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12257.001.patch, HDFS-12257.002.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.






[jira] [Updated] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-09-21 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12257:

Status: Patch Available  (was: Open)

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12257.001.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.






[jira] [Updated] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-09-21 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12257:

Attachment: HDFS-12257.001.patch

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12257.001.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.






[jira] [Assigned] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-09-21 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-12257:
---

Assignee: Huafeng Wang

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.






[jira] [Updated] (HDFS-12523) Thread pools in ErasureCodingWorker do not shutdown

2017-09-21 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12523:

Attachment: HDFS-12523.002.patch

> Thread pools in ErasureCodingWorker do not shutdown
> ---
>
> Key: HDFS-12523
> URL: https://issues.apache.org/jira/browse/HDFS-12523
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha4
>Reporter: Lei (Eddy) Xu
>Assignee: Huafeng Wang
> Attachments: HDFS-12523.001.patch, HDFS-12523.002.patch
>
>
> There is no code path in {{ErasureCodingWorker}} to shutdown its two thread 
> pools: {{stripedReconstructionPool}} and {{stripedReadPool}}.
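
A minimal sketch of the missing shutdown path described above, using plain `java.util.concurrent` executors. The field names are taken from the issue text; the class name, the pool types and sizes, and the `shutDown()` method are assumptions for illustration and are not taken from the attached patches.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a worker that owns two thread pools.
final class WorkerShutdownSketch {
    // Pool types and sizes are assumed; only the field names come from the issue.
    private final ExecutorService stripedReconstructionPool = Executors.newFixedThreadPool(2);
    private final ExecutorService stripedReadPool = Executors.newCachedThreadPool();

    /** Stops accepting new tasks on both pools; already submitted tasks still run. */
    void shutDown() {
        stripedReconstructionPool.shutdown();
        stripedReadPool.shutdown();
    }

    /** True once shutdown has been requested on both pools. */
    boolean isShutDown() {
        return stripedReconstructionPool.isShutdown() && stripedReadPool.isShutdown();
    }

    public static void main(String[] args) {
        WorkerShutdownSketch worker = new WorkerShutdownSketch();
        worker.shutDown();
        System.out.println("shut down: " + worker.isShutDown());
    }
}
```

Whether the real fix should call `shutdown()` or `shutdownNow()`, and whether it should wait for termination, depends on how the DataNode stops its other services; the sketch does not settle that.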






[jira] [Updated] (HDFS-12448) Make sure user defined erasure coding policy ID will not overflow

2017-09-21 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12448:

Attachment: HDFS-12448.002.patch

> Make sure user defined erasure coding policy ID will not overflow
> -
>
> Key: HDFS-12448
> URL: https://issues.apache.org/jira/browse/HDFS-12448
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: SammiChen
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12448.001.patch, HDFS-12448.002.patch
>
>
> Current policy ID is of type "byte".  1~63 is reserved for built-in erasure 
> coding policy. 64 above is for user defined erasure coding policy. Make sure 
> user policy ID will not overflow when addErasureCodingPolicy API is called. 
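
The overflow check asked for above could look roughly like the sketch below. The start ID of 64 comes from the description; the upper bound of `Byte.MAX_VALUE` (127) is an assumption based on the ID being a Java `byte`, and `PolicyIdAllocatorSketch` and `allocate()` are hypothetical names, not the NameNode code.

```java
// Hypothetical allocator for user-defined policy IDs; not the HDFS implementation.
final class PolicyIdAllocatorSketch {
    // 1~63 is reserved for built-in policies, so user-defined IDs start at 64.
    static final int USER_DEFINED_START_ID = 64;
    // Assumed upper bound: the largest value a signed Java byte can hold.
    static final int MAX_POLICY_ID = Byte.MAX_VALUE;

    // Kept as an int so the bound can be checked before narrowing to byte.
    private int nextId = USER_DEFINED_START_ID;

    /** Returns the next free user-defined ID, or fails instead of wrapping around. */
    synchronized byte allocate() {
        if (nextId > MAX_POLICY_ID) {
            throw new IllegalStateException("No user-defined policy ID left (max " + MAX_POLICY_ID + ")");
        }
        return (byte) nextId++;
    }

    public static void main(String[] args) {
        PolicyIdAllocatorSketch allocator = new PolicyIdAllocatorSketch();
        System.out.println("first user-defined ID: " + allocator.allocate());
    }
}
```

Tracking the counter as an `int` and casting only after the range check avoids the silent wrap to a negative value that incrementing a `byte` past 127 would cause.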






[jira] [Updated] (HDFS-12448) Make sure user defined erasure coding policy ID will not overflow

2017-09-21 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12448:

Attachment: HDFS-12448.001.patch

> Make sure user defined erasure coding policy ID will not overflow
> -
>
> Key: HDFS-12448
> URL: https://issues.apache.org/jira/browse/HDFS-12448
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
> Environment: Current policy ID is of type "byte".  1~63 is reserved 
> for built-in erasure coding policy. 64 above is for user defined erasure 
> coding policy. Make sure user policy ID will not overflow when 
> addErasureCodingPolicy API is called. 
>Reporter: SammiChen
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12448.001.patch
>
>







[jira] [Updated] (HDFS-12448) Make sure user defined erasure coding policy ID will not overflow

2017-09-21 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12448:

Status: Patch Available  (was: Open)

> Make sure user defined erasure coding policy ID will not overflow
> -
>
> Key: HDFS-12448
> URL: https://issues.apache.org/jira/browse/HDFS-12448
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
> Environment: Current policy ID is of type "byte".  1~63 is reserved 
> for built-in erasure coding policy. 64 above is for user defined erasure 
> coding policy. Make sure user policy ID will not overflow when 
> addErasureCodingPolicy API is called. 
>Reporter: SammiChen
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12448.001.patch
>
>







[jira] [Updated] (HDFS-12523) Thread pools in ErasureCodingWorker do not shutdown

2017-09-20 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12523:

Status: Patch Available  (was: Open)

> Thread pools in ErasureCodingWorker do not shutdown
> ---
>
> Key: HDFS-12523
> URL: https://issues.apache.org/jira/browse/HDFS-12523
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha4
>Reporter: Lei (Eddy) Xu
>Assignee: Huafeng Wang
> Attachments: HDFS-12523.001.patch
>
>
> There is no code path in {{ErasureCodingWorker}} to shutdown its two thread 
> pools: {{stripedReconstructionPool}} and {{stripedReadPool}}.






[jira] [Updated] (HDFS-12523) Thread pools in ErasureCodingWorker do not shutdown

2017-09-20 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12523:

Attachment: HDFS-12523.001.patch

> Thread pools in ErasureCodingWorker do not shutdown
> ---
>
> Key: HDFS-12523
> URL: https://issues.apache.org/jira/browse/HDFS-12523
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha4
>Reporter: Lei (Eddy) Xu
>Assignee: Huafeng Wang
> Attachments: HDFS-12523.001.patch
>
>
> There is no code path in {{ErasureCodingWorker}} to shutdown its two thread 
> pools: {{stripedReconstructionPool}} and {{stripedReadPool}}.






[jira] [Assigned] (HDFS-12523) Thread pools in ErasureCodingWorker do not shutdown

2017-09-20 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-12523:
---

Assignee: Huafeng Wang

> Thread pools in ErasureCodingWorker do not shutdown
> ---
>
> Key: HDFS-12523
> URL: https://issues.apache.org/jira/browse/HDFS-12523
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha4
>Reporter: Lei (Eddy) Xu
>Assignee: Huafeng Wang
>
> There is no code path in {{ErasureCodingWorker}} to shutdown its two thread 
> pools: {{stripedReconstructionPool}} and {{stripedReadPool}}.






[jira] [Commented] (HDFS-12479) Some misuses of lock in DFSStripedOutputStream

2017-09-18 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169861#comment-16169861
 ] 

Huafeng Wang commented on HDFS-12479:
-

Hi [~drankye], can you help to review this patch? Thanks!

> Some misuses of lock in DFSStripedOutputStream
> --
>
> Key: HDFS-12479
> URL: https://issues.apache.org/jira/browse/HDFS-12479
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Attachments: HDFS-12479.001.patch
>
>







[jira] [Resolved] (HDFS-12413) Inotify should support erasure coding policy op as replica meta change

2017-09-18 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang resolved HDFS-12413.
-
Resolution: Not A Problem

> Inotify should support erasure coding policy op as replica meta change
> --
>
> Key: HDFS-12413
> URL: https://issues.apache.org/jira/browse/HDFS-12413
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Kai Zheng
>Assignee: Huafeng Wang
>
> Currently HDFS Inotify already supports meta change like replica for a file. 
> We should also support erasure coding policy setting/unsetting for a file 
> similarly.






[jira] [Commented] (HDFS-12398) Use JUnit Paramaterized test suite in TestWriteReadStripedFile

2017-09-18 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169768#comment-16169768
 ] 

Huafeng Wang commented on HDFS-12398:
-

Hi [~andrew.wang], sorry I don't fully get your idea. Splitting the tests into 
subclasses cannot reduce the duplication and most of them come from the 
different file sizes. Did I miss anything?
And I also noticed the same duplication in TestDFSStripedOutputStream.

> Use JUnit Paramaterized test suite in TestWriteReadStripedFile
> --
>
> Key: HDFS-12398
> URL: https://issues.apache.org/jira/browse/HDFS-12398
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Trivial
>  Labels: flaky-test
> Attachments: HDFS-12398.001.patch, HDFS-12398.002.patch
>
>
> TestWriteReadStripedFile basically tests the full product of file sizes 
> with and without data node failure. It's better to use the JUnit Parameterized test 
> suite.
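
To illustrate the "full product" mentioned above, the sketch below builds the parameter list in the `Collection<Object[]>` shape that a JUnit 4 `@Parameterized.Parameters` method returns. It deliberately avoids a JUnit dependency so it runs on its own; the class name, method name, and file sizes are made up for the example and are not the values used by the test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; a real test would return this list from a method
// annotated with JUnit 4's @Parameterized.Parameters.
final class ParameterProductSketch {

    /** Pairs every file size with both "no failure" and "with failure". */
    static List<Object[]> parameters(int[] fileSizes) {
        List<Object[]> params = new ArrayList<>();
        for (int size : fileSizes) {
            params.add(new Object[] {size, false}); // no data node failure
            params.add(new Object[] {size, true});  // with data node failure
        }
        return params;
    }

    public static void main(String[] args) {
        // Example sizes only.
        for (Object[] p : parameters(new int[] {0, 1, 1024})) {
            System.out.println("fileSize=" + p[0] + " withFailure=" + p[1]);
        }
    }
}
```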






[jira] [Updated] (HDFS-12479) Some misuses of lock in DFSStripedOutputStream

2017-09-18 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12479:

Status: Patch Available  (was: Open)

> Some misuses of lock in DFSStripedOutputStream
> --
>
> Key: HDFS-12479
> URL: https://issues.apache.org/jira/browse/HDFS-12479
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Attachments: HDFS-12479.001.patch
>
>







[jira] [Updated] (HDFS-12479) Some misuses of lock in DFSStripedOutputStream

2017-09-18 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12479:

Attachment: HDFS-12479.001.patch

> Some misuses of lock in DFSStripedOutputStream
> --
>
> Key: HDFS-12479
> URL: https://issues.apache.org/jira/browse/HDFS-12479
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Attachments: HDFS-12479.001.patch
>
>







[jira] [Commented] (HDFS-12479) Some misuses of lock in DFSStripedOutputStream

2017-09-18 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169758#comment-16169758
 ] 

Huafeng Wang commented on HDFS-12479:
-

# In {{MultipleBlockingQueue}}, the underlying list is immutable so there will 
be no concurrent modification and lock here is not needed.
# In {{Coordinator}}, {{ConcurrentHashMap}} will have better performance than 
{{Collections.synchronizedMap}}.
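
As a rough illustration of the second point, the sketch below has several threads write to a `ConcurrentHashMap` with no external lock. The class and method names are hypothetical and the code is unrelated to the actual `Coordinator` class; it shows only that the map is safe for concurrent puts and does not measure performance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo; not the Coordinator class from DFSStripedOutputStream.
final class ConcurrentMapSketch {

    /** Runs `threads` writers, each inserting `perThread` distinct keys, and returns the map size. */
    static int countAfterConcurrentPuts(int threads, int perThread) {
        Map<Integer, Integer> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // key ranges of different writers do not overlap
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i); // no synchronized block needed
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(countAfterConcurrentPuts(4, 1000));
    }
}
```

`Collections.synchronizedMap` would also be correct here, but it guards every call with a single lock, which is the contention the comment is concerned about.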

> Some misuses of lock in DFSStripedOutputStream
> --
>
> Key: HDFS-12479
> URL: https://issues.apache.org/jira/browse/HDFS-12479
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
>







[jira] [Created] (HDFS-12479) Some misuses of lock in DFSStripedOutputStream

2017-09-18 Thread Huafeng Wang (JIRA)
Huafeng Wang created HDFS-12479:
---

 Summary: Some misuses of lock in DFSStripedOutputStream
 Key: HDFS-12479
 URL: https://issues.apache.org/jira/browse/HDFS-12479
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Huafeng Wang
Assignee: Huafeng Wang
Priority: Minor









[jira] [Commented] (HDFS-12444) Reduce runtime of TestWriteReadStripedFile

2017-09-17 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169508#comment-16169508
 ] 

Huafeng Wang commented on HDFS-12444:
-

[~drankye] the TODO mark is already removed.

> Reduce runtime of TestWriteReadStripedFile
> --
>
> Key: HDFS-12444
> URL: https://issues.apache.org/jira/browse/HDFS-12444
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, test
>Affects Versions: 3.0.0-alpha4
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12444.001.patch, HDFS-12444.002.patch, 
> HDFS-12444.003.patch
>
>
> This test takes a long time to run since it writes a lot of data, and 
> frequently times out during precommit testing. If we change the EC policy 
> from RS(6,3) to RS(3,2) then it will run a lot faster.






[jira] [Updated] (HDFS-12444) Reduce runtime of TestWriteReadStripedFile

2017-09-15 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12444:

Attachment: HDFS-12444.003.patch

> Reduce runtime of TestWriteReadStripedFile
> --
>
> Key: HDFS-12444
> URL: https://issues.apache.org/jira/browse/HDFS-12444
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, test
>Affects Versions: 3.0.0-alpha4
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-12444.001.patch, HDFS-12444.002.patch, 
> HDFS-12444.003.patch
>
>
> This test takes a long time to run since it writes a lot of data, and 
> frequently times out during precommit testing. If we change the EC policy 
> from RS(6,3) to RS(3,2) then it will run a lot faster.






[jira] [Commented] (HDFS-12444) Reduce runtime of TestWriteReadStripedFile

2017-09-14 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165971#comment-16165971
 ] 

Huafeng Wang commented on HDFS-12444:
-

Updated the patch according to Kai's suggestion.

> Reduce runtime of TestWriteReadStripedFile
> --
>
> Key: HDFS-12444
> URL: https://issues.apache.org/jira/browse/HDFS-12444
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, test
>Affects Versions: 3.0.0-alpha4
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-12444.001.patch, HDFS-12444.002.patch
>
>
> This test takes a long time to run since it writes a lot of data, and 
> frequently times out during precommit testing. If we change the EC policy 
> from RS(6,3) to RS(3,2) then it will run a lot faster.






[jira] [Updated] (HDFS-12444) Reduce runtime of TestWriteReadStripedFile

2017-09-14 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12444:

Attachment: HDFS-12444.002.patch

> Reduce runtime of TestWriteReadStripedFile
> --
>
> Key: HDFS-12444
> URL: https://issues.apache.org/jira/browse/HDFS-12444
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, test
>Affects Versions: 3.0.0-alpha4
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-12444.001.patch, HDFS-12444.002.patch
>
>
> This test takes a long time to run since it writes a lot of data, and 
> frequently times out during precommit testing. If we change the EC policy 
> from RS(6,3) to RS(3,2) then it will run a lot faster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12413) Inotify should support erasure coding policy op as replica meta change

2017-09-13 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165688#comment-16165688
 ] 

Huafeng Wang commented on HDFS-12413:
-

I looked into the code; setting and unsetting an erasure coding policy for a 
file is actually already reported in the inotify stream. These operations are 
represented as {{MetadataUpdateEvent}} with the MetadataType {{XATTRS}}.

> Inotify should support erasure coding policy op as replica meta change
> --
>
> Key: HDFS-12413
> URL: https://issues.apache.org/jira/browse/HDFS-12413
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Kai Zheng
>Assignee: Huafeng Wang
>
> Currently HDFS Inotify already supports meta change like replica for a file. 
> We should also support erasure coding policy setting/unsetting for a file 
> similarly.






[jira] [Commented] (HDFS-12414) Ensure to use CLI command to enable/disable erasure coding policy

2017-09-13 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164188#comment-16164188
 ] 

Huafeng Wang commented on HDFS-12414:
-

non-binding +1

> Ensure to use CLI command to enable/disable erasure coding policy
> -
>
> Key: HDFS-12414
> URL: https://issues.apache.org/jira/browse/HDFS-12414
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: SammiChen
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12414.001.patch, HDFS-12414.002.patch, 
> HDFS-12414.003.patch
>
>
> Currently, there are two methods for a user to enable/disable an erasure coding 
> policy. One is through the "dfs.namenode.ec.policies.enabled" property, which is a 
> static way to configure the enabled erasure coding policies. The other is 
> through the "enableErasureCodingPolicy" or "disableErasureCodingPolicy" API, 
> which can enable or disable an erasure coding policy at runtime. 
> When the NameNode restarts, there are potential state conflicts between the policies 
> defined in "dfs.namenode.ec.policies.enabled" and the policies saved in the fsImage. To 
> resolve the conflict and simplify the operation, it's better to use just one 
> way and remove the old method of configuring the property.






[jira] [Commented] (HDFS-12222) Document and test BlockLocation for erasure-coded files

2017-09-12 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16163998#comment-16163998
 ] 

Huafeng Wang commented on HDFS-1:
-

Thanks [~andrew.wang] for your advice and help!

> Document and test BlockLocation for erasure-coded files
> ---
>
> Key: HDFS-1
> URL: https://issues.apache.org/jira/browse/HDFS-1
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Fix For: 3.0.0-beta1
>
> Attachments: HDFS-1.001.patch, HDFS-1.002.patch, 
> HDFS-1.003.patch, HDFS-1.004.patch, HDFS-1.005.patch, 
> HDFS-1.006.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.
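
The workaround described above (read the EC policy, parse the schema, then decide which locations are parity) might look roughly like the sketch below. It assumes a policy name shaped like RS-6-3-1024k, with the data and parity unit counts as the second- and third-to-last fields, and assumes that locations are listed with data units first and parity units last. Neither assumption is confirmed in this thread, and the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative only: derives data/parity counts from an assumed policy-name format. */
final class EcSchemaSketch {
    final int dataUnits;
    final int parityUnits;

    private EcSchemaSketch(int dataUnits, int parityUnits) {
        this.dataUnits = dataUnits;
        this.parityUnits = parityUnits;
    }

    /** Parses names shaped like "RS-6-3-1024k" or "RS-LEGACY-6-3-1024k" (assumed format). */
    static EcSchemaSketch parse(String policyName) {
        String[] parts = policyName.split("-");
        if (parts.length < 4) {
            throw new IllegalArgumentException("Unexpected policy name: " + policyName);
        }
        // Count from the end so codec names containing '-' still work.
        int data = Integer.parseInt(parts[parts.length - 3]);
        int parity = Integer.parseInt(parts[parts.length - 2]);
        return new EcSchemaSketch(data, parity);
    }

    /** True if the unit at this index is parity, assuming data units come first. */
    boolean isParity(int indexInGroup) {
        if (indexInGroup < 0 || indexInGroup >= dataUnits + parityUnits) {
            throw new IndexOutOfBoundsException("index " + indexInGroup);
        }
        return indexInGroup >= dataUnits;
    }

    /** Keeps only the hosts that hold data units, given hosts ordered by unit index. */
    String[] dataHosts(String[] hostsByIndex) {
        return Arrays.copyOf(hostsByIndex, Math.min(hostsByIndex.length, dataUnits));
    }
}
```

Every application would need its own copy of this kind of parsing, which is the motivation for exposing the information through BlockLocation instead.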






[jira] [Commented] (HDFS-12405) Clean up removed erasure coding policies from namenode

2017-09-12 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16162668#comment-16162668
 ] 

Huafeng Wang commented on HDFS-12405:
-

I have a few questions about this issue. Why do we have to clean up the 
removed policies? NameNode restarts are infrequent, so cleaning up at that 
time can only cover a small portion of the removed policies. Would cleaning 
them up only when the NameNode restarts suffice? 

> Clean up removed erasure coding policies from namenode
> --
>
> Key: HDFS-12405
> URL: https://issues.apache.org/jira/browse/HDFS-12405
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: SammiChen
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
>
> Currently, when an erasure coding policy is removed, it is transitioned to 
> the "removed" state. Users can no longer apply a policy in the "removed" 
> state to a file/directory. The policy cannot be safely removed from the 
> system unless we know there are no existing files or directories that use 
> this "removed" policy. Finding out whether any files or directories are 
> still using the policy is time consuming at runtime and might impact 
> Namenode performance. So a better choice is to do the work when the NameNode 
> restarts and loads inodes. Collecting the information at that time will not 
> introduce much extra overhead. 
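
A rough sketch of the approach described above: count policy references while inodes are being loaded, then drop any policy that is in the "removed" state and has no remaining references. The types and names here are hypothetical stand-ins, not NameNode code.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Illustrative bookkeeping only; not how the NameNode actually stores policies. */
final class RemovedPolicyCleanupSketch {
    enum State { ENABLED, DISABLED, REMOVED }

    private final Map<Integer, State> policies = new HashMap<>();
    private final Map<Integer, Integer> refCounts = new HashMap<>();

    void addPolicy(int id, State state) {
        policies.put(id, state);
    }

    /** Called once per inode during loading; a null id means the inode has no EC policy. */
    void onInodeLoaded(Integer ecPolicyId) {
        if (ecPolicyId != null) {
            refCounts.merge(ecPolicyId, 1, Integer::sum);
        }
    }

    /** After loading, purges REMOVED policies that no inode references; returns how many. */
    int purgeUnusedRemovedPolicies() {
        int purged = 0;
        Iterator<Map.Entry<Integer, State>> it = policies.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, State> entry = it.next();
            boolean unused = !refCounts.containsKey(entry.getKey());
            if (entry.getValue() == State.REMOVED && unused) {
                it.remove();
                purged++;
            }
        }
        return purged;
    }

    boolean hasPolicy(int id) {
        return policies.containsKey(id);
    }
}
```

The counting piggybacks on a pass over the inodes that happens anyway at startup, which is why the description expects little extra overhead.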






[jira] [Updated] (HDFS-12222) Add EC information to BlockLocation

2017-09-12 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-1:

Attachment: HDFS-1.006.patch

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-1
> URL: https://issues.apache.org/jira/browse/HDFS-1
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-1.001.patch, HDFS-1.002.patch, 
> HDFS-1.003.patch, HDFS-1.004.patch, HDFS-1.005.patch, 
> HDFS-1.006.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.






[jira] [Assigned] (HDFS-12413) Inotify should support erasure coding policy op as replica meta change

2017-09-10 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-12413:
---

Assignee: Huafeng Wang

> Inotify should support erasure coding policy op as replica meta change
> --
>
> Key: HDFS-12413
> URL: https://issues.apache.org/jira/browse/HDFS-12413
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Kai Zheng
>Assignee: Huafeng Wang
>
> Currently HDFS Inotify already supports metadata changes such as replication 
> changes for a file. We should also support erasure coding policy 
> setting/unsetting for a file similarly.






[jira] [Commented] (HDFS-12398) Use JUnit Paramaterized test suite in TestWriteReadStripedFile

2017-09-10 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160561#comment-16160561
 ] 

Huafeng Wang commented on HDFS-12398:
-

Hi [~drankye], thanks very much for your review.
{quote}
1. The current way of having many test methods are much better readable;
{quote}
That's true. I can add some comments on the parameters if you wish, but I 
think the EC file names already tell what each test is doing.
{quote}
2. It's also easier to debug if some of them are failed;
{quote}
That's also true; it's a limitation of JUnit Parameterized. 
{quote}
3. More important, every test case (contained in a test method) needs a brand 
new cluster to start with;
{quote}
That's intended: each test randomly kills a datanode, so it needs to start 
with a new cluster.
{quote}
4. Timeout can be fine-tuned for each test method in current way.
{quote}
That's not true. Before the refactor, the timeout was controlled by
{code}
@Rule
public Timeout globalTimeout = new Timeout(30);
{code}
which applies the same timeout to all test methods in a class. 
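
For reference, the parameter set in a parameterized version is just the cross product of file sizes and the datanode-failure flag. A minimal sketch of such a generator follows. In a JUnit 4 Parameterized suite a method like this would carry the Parameterized.Parameters annotation, which is left out here so the snippet compiles without JUnit. The class name is made up for illustration and the sizes are passed in by the caller; nothing here is taken from the patch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Illustrative parameter generator; not the code in TestWriteReadStripedFile. */
final class StripedFileParamsSketch {
    /** Builds one {fileSize, killDatanode} row for every combination. */
    static Collection<Object[]> parameters(int[] fileSizes) {
        List<Object[]> rows = new ArrayList<>();
        for (int size : fileSizes) {
            for (boolean killDatanode : new boolean[] {false, true}) {
                rows.add(new Object[] {size, killDatanode});
            }
        }
        return rows;
    }
}
```

Each row becomes one test instance, so N file sizes yield 2N runs that each start from a fresh cluster.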


> Use JUnit Paramaterized test suite in TestWriteReadStripedFile
> --
>
> Key: HDFS-12398
> URL: https://issues.apache.org/jira/browse/HDFS-12398
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Trivial
> Attachments: HDFS-12398.001.patch, HDFS-12398.002.patch
>
>
> The TestWriteReadStripedFile is basically doing the full product of file size 
> with data node failure or not. It's better to use JUnit Paramaterized test 
> suite.






[jira] [Updated] (HDFS-12398) Use JUnit Paramaterized test suite in TestWriteReadStripedFile

2017-09-08 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12398:

Attachment: HDFS-12398.002.patch

> Use JUnit Paramaterized test suite in TestWriteReadStripedFile
> --
>
> Key: HDFS-12398
> URL: https://issues.apache.org/jira/browse/HDFS-12398
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Trivial
> Attachments: HDFS-12398.001.patch, HDFS-12398.002.patch
>
>
> The TestWriteReadStripedFile is basically doing the full product of file size 
> with data node failure or not. It's better to use JUnit Paramaterized test 
> suite.






[jira] [Assigned] (HDFS-12405) Clean up removed erasure coding policies from namenode

2017-09-08 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-12405:
---

Assignee: Huafeng Wang

> Clean up removed erasure coding policies from namenode
> --
>
> Key: HDFS-12405
> URL: https://issues.apache.org/jira/browse/HDFS-12405
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: SammiChen
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
>
> Currently, when an erasure coding policy is removed, it is transitioned to 
> the "removed" state. Users can no longer apply a policy in the "removed" 
> state to a file/directory. The policy cannot be safely removed from the 
> system unless we know there are no existing files or directories that use 
> this "removed" policy. Finding out whether any files or directories are 
> still using the policy is time consuming at runtime and might impact 
> Namenode performance. So a better choice is to do the work when the NameNode 
> restarts and loads inodes. Collecting the information at that time will not 
> introduce much extra overhead. 






[jira] [Updated] (HDFS-12398) Use JUnit Paramaterized test suite in TestWriteReadStripedFile

2017-09-07 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12398:

Attachment: HDFS-12398.001.patch

> Use JUnit Paramaterized test suite in TestWriteReadStripedFile
> --
>
> Key: HDFS-12398
> URL: https://issues.apache.org/jira/browse/HDFS-12398
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Trivial
> Attachments: HDFS-12398.001.patch
>
>
> The TestWriteReadStripedFile is basically doing the full product of file size 
> with data node failure or not. It's better to use JUnit Paramaterized test 
> suite.






[jira] [Updated] (HDFS-12398) Use JUnit Paramaterized test suite in TestWriteReadStripedFile

2017-09-07 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12398:

Status: Patch Available  (was: Open)

> Use JUnit Paramaterized test suite in TestWriteReadStripedFile
> --
>
> Key: HDFS-12398
> URL: https://issues.apache.org/jira/browse/HDFS-12398
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Trivial
> Attachments: HDFS-12398.001.patch
>
>
> The TestWriteReadStripedFile is basically doing the full product of file size 
> with data node failure or not. It's better to use JUnit Paramaterized test 
> suite.






[jira] [Updated] (HDFS-12222) Add EC information to BlockLocation

2017-09-07 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-1:

Attachment: HDFS-1.005.patch

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-1
> URL: https://issues.apache.org/jira/browse/HDFS-1
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-1.001.patch, HDFS-1.002.patch, 
> HDFS-1.003.patch, HDFS-1.004.patch, HDFS-1.005.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.






[jira] [Commented] (HDFS-12222) Add EC information to BlockLocation

2017-09-07 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16158013#comment-16158013
 ] 

Huafeng Wang commented on HDFS-1:
-

Hi Andrew, I agree with you. I'll update the patch soon.

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-1
> URL: https://issues.apache.org/jira/browse/HDFS-1
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-1.001.patch, HDFS-1.002.patch, 
> HDFS-1.003.patch, HDFS-1.004.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.






[jira] [Commented] (HDFS-12222) Add EC information to BlockLocation

2017-09-07 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156634#comment-16156634
 ] 

Huafeng Wang commented on HDFS-1:
-

Hi [~andrew.wang], thanks for your review! I just uploaded a new patch. In this 
patch I mainly:
* Removed the getECBlockLocation function and ECBlockLocation class.
* Fixed {{getFileBlockLocation}} of DFSClient.
* Added comments about {{getFileBlockLocation}}, {{listFiles}} and 
{{listLocatedStatus}} in {{FileSystem}}, {{DistributedFileSystem}} and 
{{FileContext}}.
* Added comments about {{makeQualifiedLocated}} in {{HdfsLocatedFileStatus}}.
* Added tests for {{DistributedFileSystem.getFileBlockLocation}}, 
{{DistributedFileSystem.listFiles}}, {{FileContext.getFileBlockLocation}} and 
{{FileContext.listFiles}} for EC files of various sizes.

And about 
{quote}
Could you verify that fsck -files -blocks -locations still returns parity 
blocks?
{quote}

I checked the output of {{fsck -files -blocks -locations}}; it does not have 
very detailed block location info for an erasure-coded file. Example output 
for a 6+3 erasure-coded file looks like this: 
{code}
0. BP-417570284-10.239.160.132-1504687036886:blk_-9223372036854775792_1001 
len=6291456 Live_repl=9  
[blk_-9223372036854775792:DatanodeInfoWithStorage[127.0.0.1:54859,DS-09a24593-5cbc-444c-ad43-ab1b39c65887,DISK](LIVE),
 
blk_-9223372036854775791:DatanodeInfoWithStorage[127.0.0.1:54863,DS-80d7a2bb-5acc-437c-936a-bd28314e2a8c,DISK](LIVE),
 
blk_-9223372036854775790:DatanodeInfoWithStorage[127.0.0.1:54883,DS-05a880c7-0fa2-4683-a382-06ec7d975fd3,DISK](LIVE),
 
blk_-9223372036854775789:DatanodeInfoWithStorage[127.0.0.1:54854,DS-8a5cf2da-1c7e-4942-b57c-8755ddb3cfcb,DISK](LIVE),
 
blk_-9223372036854775788:DatanodeInfoWithStorage[127.0.0.1:54871,DS-95c64656-3131-413c-b400-0f14612b387d,DISK](LIVE),
 
blk_-9223372036854775787:DatanodeInfoWithStorage[127.0.0.1:54867,DS-fbf6ea90-8829-44ce-8681-b5f53be726c1,DISK](STALE_BLOCK_CONTENT),
 
blk_-9223372036854775786:DatanodeInfoWithStorage[127.0.0.1:54875,DS-d40bfede-c5c9-4cb0-8b5e-92ead1bbb4da,DISK](LIVE),
 
blk_-9223372036854775785:DatanodeInfoWithStorage[127.0.0.1:54879,DS-c999124f-3d0e-4f6c-bd31-5f0fdff86fca,DISK](STALE_BLOCK_CONTENT),
 
blk_-9223372036854775784:DatanodeInfoWithStorage[127.0.0.1:54850,DS-7ff8f0ed-b62a-40a9-8966-b16f71532712,DISK](LIVE)]
{code} 

So you mean we should also remove the parity blocks info?
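
For what it's worth, the internal blocks in the output above have consecutive IDs starting at the block group ID, so with a 6+3 layout one can tell data from parity by offset. A small sketch under that assumption follows. The helper names are hypothetical, and the data-first ordering is an assumption rather than something verified here.

```java
/** Illustrative helpers for reading the fsck output above; not HDFS code. */
final class BlockGroupIndexSketch {
    /** Index of an internal block, assuming IDs run consecutively from the group ID. */
    static int indexInGroup(long groupId, long internalBlockId) {
        return (int) (internalBlockId - groupId);
    }

    /** True if the index falls after the data units (assumes data units come first). */
    static boolean isParity(int indexInGroup, int dataUnits) {
        return indexInGroup >= dataUnits;
    }
}
```

With the IDs shown above, blk_-9223372036854775786 through blk_-9223372036854775784 map to indexes 6 to 8, which under this assumption are the three parity blocks.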

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-1
> URL: https://issues.apache.org/jira/browse/HDFS-1
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-1.001.patch, HDFS-1.002.patch, 
> HDFS-1.003.patch, HDFS-1.004.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.






[jira] [Issue Comment Deleted] (HDFS-12222) Add EC information to BlockLocation

2017-09-07 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-1:

Comment: was deleted

(was: Hi [~andrew.wang], thanks for your review! I just uploaded a new patch. 
In this patch I mainly:
* Removed the getECBlockLocation function and ECBlockLocation class.
* Fixed getFileBlockLocation of DFSClient.
* Add comments for {{getFileBlockLocation}}, {{listFiles}} and 
{{listLocatedStatus}})

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-1
> URL: https://issues.apache.org/jira/browse/HDFS-1
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-1.001.patch, HDFS-1.002.patch, 
> HDFS-1.003.patch, HDFS-1.004.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.






[jira] [Commented] (HDFS-12222) Add EC information to BlockLocation

2017-09-07 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156621#comment-16156621
 ] 

Huafeng Wang commented on HDFS-1:
-

Hi [~andrew.wang], thanks for your review! I just uploaded a new patch. In this 
patch I mainly:
* Removed the getECBlockLocation function and ECBlockLocation class.
* Fixed getFileBlockLocation of DFSClient.
* Added comments for {{getFileBlockLocation}}, {{listFiles}} and 
{{listLocatedStatus}}

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-1
> URL: https://issues.apache.org/jira/browse/HDFS-1
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-1.001.patch, HDFS-1.002.patch, 
> HDFS-1.003.patch, HDFS-1.004.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.






[jira] [Updated] (HDFS-12222) Add EC information to BlockLocation

2017-09-07 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-1:

Attachment: HDFS-1.004.patch

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-1
> URL: https://issues.apache.org/jira/browse/HDFS-1
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-1.001.patch, HDFS-1.002.patch, 
> HDFS-1.003.patch, HDFS-1.004.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.






[jira] [Created] (HDFS-12398) Use JUnit Paramaterized test suite in TestWriteReadStripedFile

2017-09-06 Thread Huafeng Wang (JIRA)
Huafeng Wang created HDFS-12398:
---

 Summary: Use JUnit Paramaterized test suite in 
TestWriteReadStripedFile
 Key: HDFS-12398
 URL: https://issues.apache.org/jira/browse/HDFS-12398
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Reporter: Huafeng Wang
Priority: Trivial


The TestWriteReadStripedFile is basically doing the full product of file size 
with data node failure or not. It's better to use JUnit Paramaterized test 
suite.






[jira] [Assigned] (HDFS-12398) Use JUnit Paramaterized test suite in TestWriteReadStripedFile

2017-09-06 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-12398:
---

Assignee: Huafeng Wang

> Use JUnit Paramaterized test suite in TestWriteReadStripedFile
> --
>
> Key: HDFS-12398
> URL: https://issues.apache.org/jira/browse/HDFS-12398
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Trivial
>
> The TestWriteReadStripedFile is basically doing the full product of file size 
> with data node failure or not. It's better to use JUnit Paramaterized test 
> suite.






[jira] [Updated] (HDFS-12388) A bad error message in DFSStripedOutputStream

2017-09-04 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12388:

Attachment: HDFS-12388.001.patch

> A bad error message in DFSStripedOutputStream
> -
>
> Key: HDFS-12388
> URL: https://issues.apache.org/jira/browse/HDFS-12388
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kai Zheng
>Assignee: Huafeng Wang
> Attachments: HDFS-12388.001.patch
>
>
> Noticed a failure reported by Jenkins in HDFS-11882. The reported error 
> message wasn't correct: {{the number of failed blocks = 4 > the number of 
> data blocks = 3}} should be {{the number of failed blocks = 4 > the number 
> of parity blocks = 3}}. 
> {noformat}
> Regression
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030.testBlockTokenExpired
> Failing for the past 1 build (Since Failed#20973 )
> Took 6.4 sec.
> Error Message
> Failed at i=6294527
> Stacktrace
> java.io.IOException: Failed at i=6294527
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.write(TestDFSStripedOutputStreamWithFailure.java:559)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:534)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testBlockTokenExpired(TestDFSStripedOutputStreamWithFailure.java:273)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: java.io.IOException: Failed: the number of failed blocks = 4 > the 
> number of data blocks = 3
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamers(DFSStripedOutputStream.java:392)
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.handleStreamerFailure(DFSStripedOutputStream.java:410)
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.flushAllInternals(DFSStripedOutputStream.java:1262)
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.checkStreamerFailures(DFSStripedOutputStream.java:627)
>   at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.writeChunk(DFSStripedOutputStream.java:563)
>   at 
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
>   at 
> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:164)
>   at 
> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:145)
>   at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:79)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
>   at java.io.DataOutputStream.write(DataOutputStream.java:88)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.write(TestDFSStripedOutputStreamWithFailure.java:557)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.runTest(TestDFSStripedOutputStreamWithFailure.java:534)
>   at 
> org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure.testBlockTokenExpired(TestDFSStripedOutputStreamWithFailure.java:273)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}
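
For illustration, here is a minimal, self-contained sketch of the kind of check behind this message. It is not the actual DFSStripedOutputStream code: the class name `StreamerFailureCheck`, the method `check`, and its parameters are hypothetical. It assumes the rule implied by the description above, namely that a striped write tolerates at most as many failed streamers as there are parity blocks, so the message should name the parity block count.

```java
import java.io.IOException;

/** Hypothetical stand-in for the failure check; this is not the real HDFS class. */
final class StreamerFailureCheck {
  private StreamerFailureCheck() {}

  /**
   * Throws when more streamers have failed than the parity blocks can cover.
   * The message names the parity block count, since that is the limit exceeded.
   */
  static void check(int failedBlocks, int parityBlocks) throws IOException {
    if (failedBlocks > parityBlocks) {
      throw new IOException("Failed: the number of failed blocks = "
          + failedBlocks + " > the number of parity blocks = " + parityBlocks);
    }
  }
}
```

With 3 parity blocks, `check(3, 3)` returns normally and `check(4, 3)` throws, which matches the case in the stack trace above.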




[jira] [Updated] (HDFS-12388) A bad error message in DFSStripedOutputStream

2017-09-04 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12388:

Status: Patch Available  (was: Open)

> A bad error message in DFSStripedOutputStream
> -
>
> Key: HDFS-12388
> URL: https://issues.apache.org/jira/browse/HDFS-12388
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kai Zheng
>Assignee: Huafeng Wang
> Attachments: HDFS-12388.001.patch
>
>
> Noticed a failure reported by Jenkins in HDFS-11882. The reported error 
> message was incorrect: it says "data blocks" where it should say "parity 
> blocks". That is, {{the number of failed blocks = 4 > the number of data 
> blocks = 3}} should become {{the number of failed blocks = 4 > the number of 
> parity blocks = 3}}.




[jira] [Assigned] (HDFS-12388) A bad error message in DFSStripedOutputStream

2017-09-04 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-12388:
---

Assignee: Huafeng Wang

> A bad error message in DFSStripedOutputStream
> -
>
> Key: HDFS-12388
> URL: https://issues.apache.org/jira/browse/HDFS-12388
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kai Zheng
>Assignee: Huafeng Wang
>
> Noticed a failure reported by Jenkins in HDFS-11882. The reported error 
> message was incorrect: it says "data blocks" where it should say "parity 
> blocks". That is, {{the number of failed blocks = 4 > the number of data 
> blocks = 3}} should become {{the number of failed blocks = 4 > the number of 
> parity blocks = 3}}.




[jira] [Updated] (HDFS-12222) Add EC information to BlockLocation

2017-09-04 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12222:

Attachment: HDFS-12222.003.patch

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12222.001.patch, HDFS-12222.002.patch, 
> HDFS-12222.003.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.
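
The offset arithmetic quoted above can be tried in isolation. The following is only a sketch: `SimpleBlockLocation` and `SplitMath` are hypothetical names standing in for Hadoop's BlockLocation and the caller's split logic, and nothing here models erasure coding. It just reproduces the calculation, which presumes that each location's offset and length describe bytes of the logical file. That is the assumption the description says EC breaks.

```java
/** Hypothetical, simplified stand-in for Hadoop's BlockLocation. */
final class SimpleBlockLocation {
  private final long offset;
  private final long length;

  SimpleBlockLocation(long offset, long length) {
    this.offset = offset;
    this.length = length;
  }

  long getOffset() { return offset; }
  long getLength() { return length; }
}

final class SplitMath {
  private SplitMath() {}

  /** Bytes between {@code offset} and the end of the block at {@code startIndex}. */
  static long bytesInThisBlock(SimpleBlockLocation[] blkLocations,
                               int startIndex, long offset) {
    return blkLocations[startIndex].getOffset()
        + blkLocations[startIndex].getLength() - offset;
  }
}
```

For two consecutive 128-byte blocks, a read starting at offset 150 falls in the second block and has 128 + 128 - 150 = 106 bytes left in it.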




[jira] [Updated] (HDFS-12222) Add EC information to BlockLocation

2017-09-04 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12222:

Status: Patch Available  (was: Open)

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12222.001.patch, HDFS-12222.002.patch, 
> HDFS-12222.003.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.




[jira] [Assigned] (HDFS-11066) Improve test coverage for ISA-L native coder

2017-08-30 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-11066:
---

Assignee: Huafeng Wang

> Improve test coverage for ISA-L native coder
> 
>
> Key: HDFS-11066
> URL: https://issues.apache.org/jira/browse/HDFS-11066
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
>
> Some issues were introduced but not found in time because Jenkins lacks the 
> necessary support for the ISA-L related build options. We should re-enable 
> the ISA-L related build options in Jenkins to ensure the quality of the 
> related native code.




[jira] [Commented] (HDFS-12222) Add EC information to BlockLocation

2017-08-27 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143291#comment-16143291
 ] 

Huafeng Wang commented on HDFS-12222:
-

Hi [~andrew.wang], any comment on my latest update?

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12222.001.patch, HDFS-12222.002.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.




[jira] [Commented] (HDFS-12222) Add EC information to BlockLocation

2017-08-22 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136375#comment-16136375
 ] 

Huafeng Wang commented on HDFS-12222:
-

I just tweaked the patch according to your suggestions. Is it on the right track? 
As for the new API that returns both data and parity blocks, I'm inclined to 
place it in DFSClient and DistributedFileSystem, something like 
{code}
public ErasureCodedBlockLocation getECBlockLocation(Path p);
{code}

Is it a proper way to do that?
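
To make the question concrete, one possible shape for such a return type is sketched below. Everything here is hypothetical and not taken from the patch: the class `EcLocationSketch`, its methods, and in particular the assumption that the first `numDataUnits` hosts of a block group hold data blocks and the remaining hosts hold parity blocks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical result type that keeps data and parity locations apart. */
final class EcLocationSketch {
  private final List<String> dataHosts;
  private final List<String> parityHosts;

  private EcLocationSketch(List<String> dataHosts, List<String> parityHosts) {
    this.dataHosts = dataHosts;
    this.parityHosts = parityHosts;
  }

  /**
   * Splits one block group's hosts, assuming (for this sketch only) that the
   * first numDataUnits entries hold data blocks and the rest hold parity.
   */
  static EcLocationSketch split(List<String> hosts, int numDataUnits) {
    if (numDataUnits < 0 || numDataUnits > hosts.size()) {
      throw new IllegalArgumentException("numDataUnits out of range: " + numDataUnits);
    }
    // Copy the sublists so later changes to the input do not leak through.
    List<String> data = new ArrayList<>(hosts.subList(0, numDataUnits));
    List<String> parity = new ArrayList<>(hosts.subList(numDataUnits, hosts.size()));
    return new EcLocationSketch(
        Collections.unmodifiableList(data), Collections.unmodifiableList(parity));
  }

  List<String> getDataHosts() { return dataHosts; }
  List<String> getParityHosts() { return parityHosts; }
}
```

A caller computing splits would then read only `getDataHosts()`, while tools that care about recovery could also inspect `getParityHosts()`.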

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12222.001.patch, HDFS-12222.002.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.




[jira] [Updated] (HDFS-12222) Add EC information to BlockLocation

2017-08-22 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12222:

Attachment: HDFS-12222.002.patch

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12222.001.patch, HDFS-12222.002.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.




[jira] [Commented] (HDFS-12222) Add EC information to BlockLocation

2017-08-17 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130050#comment-16130050
 ] 

Huafeng Wang commented on HDFS-12222:
-

Hi guys, I just uploaded an initial patch that only sketches the basic idea. 
In the current implementation, the LocatedFileStatus that FileInputFormat (FIF) 
fetches is converted from an HdfsLocatedFileStatus if the underlying file system 
is HDFS, and each BlockLocation is actually a block group in the erasure coding 
case. In my first patch, I added an ErasureCodedBlockLocation to 
LocatedFileStatus; this property is set if the HdfsLocatedFileStatus is erasure coded.

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12222.001.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.




[jira] [Updated] (HDFS-12222) Add EC information to BlockLocation

2017-08-17 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12222:

Attachment: HDFS-12222.001.patch

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-12222.001.patch
>
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.




[jira] [Commented] (HDFS-12269) Better to return a Map rather than HashMap in getErasureCodingCodecs

2017-08-16 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128582#comment-16128582
 ] 

Huafeng Wang commented on HDFS-12269:
-

Hi [~ajisakaa], thanks for your review. I just uploaded a new patch.

> Better to return a Map rather than HashMap in getErasureCodingCodecs
> 
>
> Key: HDFS-12269
> URL: https://issues.apache.org/jira/browse/HDFS-12269
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Attachments: HDFS-12269.001.patch, HDFS-12269.002.patch, 
> HDFS-12269.003.patch
>
>
> Currently the getErasureCodingCodecs function defined in ClientProtocol 
> returns a HashMap:
> {code:java}
>   HashMap getErasureCodingCodecs() throws IOException;
> {code}
> It's better to return a Map.
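
A small sketch of why the interface type is preferable follows. The names `CodecProtocol` and `SortedCodecs` are hypothetical, the `Map<String, String>` type parameters and the sample entries are assumptions made for illustration, and the `throws IOException` clause from the quoted signature is left out for brevity. Declaring `Map` lets an implementation choose a different map type, or return a read-only view, without changing the method signature.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical protocol interface; only the declared return type matters here. */
interface CodecProtocol {
  Map<String, String> getErasureCodingCodecs();
}

/** One possible implementation: a sorted, read-only map instead of a HashMap. */
final class SortedCodecs implements CodecProtocol {
  @Override
  public Map<String, String> getErasureCodingCodecs() {
    Map<String, String> codecs = new TreeMap<>();
    codecs.put("xor", "xor_java"); // illustrative entries only
    codecs.put("rs", "rs_java");
    // Callers can read the result but cannot mutate the server's view.
    return Collections.unmodifiableMap(codecs);
  }
}
```

Had the method been declared to return `HashMap`, neither the `TreeMap` nor the unmodifiable wrapper above would compile as a return value.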




[jira] [Updated] (HDFS-12269) Better to return a Map rather than HashMap in getErasureCodingCodecs

2017-08-16 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12269:

Attachment: HDFS-12269.003.patch

> Better to return a Map rather than HashMap in getErasureCodingCodecs
> 
>
> Key: HDFS-12269
> URL: https://issues.apache.org/jira/browse/HDFS-12269
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Attachments: HDFS-12269.001.patch, HDFS-12269.002.patch, 
> HDFS-12269.003.patch
>
>
> Currently the getErasureCodingCodecs function defined in ClientProtocol 
> returns a HashMap:
> {code:java}
>   HashMap getErasureCodingCodecs() throws IOException;
> {code}
> It's better to return a Map.




[jira] [Updated] (HDFS-12269) Better to return a Map rather than HashMap in getErasureCodingCodecs

2017-08-15 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12269:

Attachment: HDFS-12269.002.patch

> Better to return a Map rather than HashMap in getErasureCodingCodecs
> 
>
> Key: HDFS-12269
> URL: https://issues.apache.org/jira/browse/HDFS-12269
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Attachments: HDFS-12269.001.patch, HDFS-12269.002.patch
>
>
> Currently the getErasureCodingCodecs function defined in ClientProtocol 
> returns a HashMap:
> {code:java}
>   HashMap getErasureCodingCodecs() throws IOException;
> {code}
> It's better to return a Map.




[jira] [Updated] (HDFS-12269) Better to return a Map rather than HashMap in getErasureCodingCodecs

2017-08-14 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12269:

Status: Patch Available  (was: Open)

> Better to return a Map rather than HashMap in getErasureCodingCodecs
> 
>
> Key: HDFS-12269
> URL: https://issues.apache.org/jira/browse/HDFS-12269
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Attachments: HDFS-12269.001.patch
>
>
> Currently the getErasureCodingCodecs function defined in ClientProtocol 
> returns a HashMap:
> {code:java}
>   HashMap getErasureCodingCodecs() throws IOException;
> {code}
> It's better to return a Map.




[jira] [Updated] (HDFS-12269) Better to return a Map rather than HashMap in getErasureCodingCodecs

2017-08-14 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12269:

Attachment: HDFS-12269.001.patch

> Better to return a Map rather than HashMap in getErasureCodingCodecs
> 
>
> Key: HDFS-12269
> URL: https://issues.apache.org/jira/browse/HDFS-12269
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Huafeng Wang
>Assignee: Huafeng Wang
>Priority: Minor
> Attachments: HDFS-12269.001.patch
>
>
> Currently the getErasureCodingCodecs function defined in ClientProtocol 
> returns a HashMap:
> {code:java}
>   HashMap getErasureCodingCodecs() throws IOException;
> {code}
> It's better to return a Map.




[jira] [Commented] (HDFS-12222) Add EC information to BlockLocation

2017-08-10 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122627#comment-16122627
 ] 

Huafeng Wang commented on HDFS-12222:
-

Hi [~drankye], you're right. I think it's a better way and I'll give it a try.

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.






[jira] [Commented] (HDFS-12222) Add EC information to BlockLocation

2017-08-10 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121223#comment-16121223
 ] 

Huafeng Wang commented on HDFS-12222:
-

I've checked the related code, and it is not easy to provide separate 
functions for getting parity or data blocks.
The problem is that LocatedFileStatus is a subclass of FileStatus, and both 
live in the hadoop-common module, which has no erasure coding policy 
information for files. Without that policy information, LocatedFileStatus 
cannot tell which BlockLocation is actually a parity block. 

After discussing this with Kai offline, one approach is to add an ECSchema to 
LocatedFileStatus so that we can determine which blocks are parity blocks when 
erasure coding is enabled. 
Any suggestions here? Thanks.
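
As a rough sketch of what having the schema available would allow, the class 
below classifies an internal block as data or parity from its index within a 
block group. EcLayoutSketch and its fields are hypothetical names, not existing 
Hadoop classes, and the sketch assumes internal blocks are indexed data-first 
(indices 0 to numDataUnits - 1 are data, the rest parity).

```java
// Hypothetical, minimal stand-in for an EC schema. The real ECSchema class
// carries more than these two counts; only what the check needs is modeled.
final class EcLayoutSketch {
  private final int numDataUnits;
  private final int numParityUnits;

  EcLayoutSketch(int numDataUnits, int numParityUnits) {
    if (numDataUnits <= 0 || numParityUnits < 0) {
      throw new IllegalArgumentException("invalid unit counts");
    }
    this.numDataUnits = numDataUnits;
    this.numParityUnits = numParityUnits;
  }

  // Assumption: within a block group, data blocks come first, then parity.
  boolean isParity(int indexInGroup) {
    if (indexInGroup < 0 || indexInGroup >= numDataUnits + numParityUnits) {
      throw new IllegalArgumentException("index outside block group: " + indexInGroup);
    }
    return indexInGroup >= numDataUnits;
  }

  public static void main(String[] args) {
    // Example layout with 6 data units and 3 parity units.
    EcLayoutSketch layout = new EcLayoutSketch(6, 3);
    for (int i = 0; i < 9; i++) {
      System.out.println(i + " -> " + (layout.isParity(i) ? "parity" : "data"));
    }
  }
}
```

Without the two counts, the index alone says nothing about whether a block is 
parity, which is the gap described above for LocatedFileStatus.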

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.






[jira] [Commented] (HDFS-12036) Add audit log for getErasureCodingPolicy, getErasureCodingPolicies, getErasureCodingCodecs

2017-08-07 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116381#comment-16116381
 ] 

Huafeng Wang commented on HDFS-12036:
-

Hi Kai, the function is defined in ClientProtocol so I think it should be fixed 
in another issue. I just created one: 
https://issues.apache.org/jira/browse/HDFS-12269

> Add audit log for getErasureCodingPolicy, getErasureCodingPolicies, 
> getErasureCodingCodecs
> --
>
> Key: HDFS-12036
> URL: https://issues.apache.org/jira/browse/HDFS-12036
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-12036.001.patch, HDFS-12036.002.patch, 
> HDFS-12036.003.patch
>
>
> These three FSNamesystem operations do not yet record audit logs. I am not 
> sure how useful these audit logs would be, but thought I should file them so 
> that they don't get dropped if they turn out to be needed.






[jira] [Created] (HDFS-12269) Better to return a Map rather than HashMap in getErasureCodingCodecs

2017-08-07 Thread Huafeng Wang (JIRA)

Huafeng Wang created HDFS-12269:
---

 Summary: Better to return a Map rather than HashMap in getErasureCodingCodecs
 Key: HDFS-12269
 URL: https://issues.apache.org/jira/browse/HDFS-12269
 Project: Hadoop HDFS
 Issue Type: Improvement
 Components: erasure-coding
 Reporter: Huafeng Wang
 Assignee: Huafeng Wang
 Priority: Minor


Currently the getErasureCodingCodecs function defined in ClientProtocol returns 
a HashMap:
{code:java}
  HashMap getErasureCodingCodecs() throws IOException;
{code}
It's better to return a Map.
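
A minimal sketch of the motivation follows. CodecListingSketch is a 
hypothetical stand-in for the one method in question, not the real 
ClientProtocol interface; the String type parameters and the codec and coder 
names are assumptions made only for illustration.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the method under discussion. The
// <String, String> type parameters are an assumption.
interface CodecListingSketch {
  Map<String, String> getErasureCodingCodecs() throws IOException;
}

final class MapReturnTypeSketch {
  private MapReturnTypeSketch() {}

  // One implementation may keep using a HashMap internally.
  static CodecListingSketch hashBacked() {
    return () -> {
      Map<String, String> codecs = new HashMap<>();
      codecs.put("rs", "rs_java"); // illustrative entry only
      return codecs;
    };
  }

  // Another may return a sorted, read-only view; callers are unaffected.
  static CodecListingSketch sortedReadOnly() {
    return () -> {
      Map<String, String> codecs = new TreeMap<>();
      codecs.put("xor", "xor_java"); // illustrative entries only
      codecs.put("rs", "rs_java");
      return Collections.unmodifiableMap(codecs);
    };
  }

  public static void main(String[] args) throws IOException {
    // Caller code depends only on the Map interface.
    CodecListingSketch[] impls = {hashBacked(), sortedReadOnly()};
    for (CodecListingSketch impl : impls) {
      Map<String, String> codecs = impl.getErasureCodingCodecs();
      System.out.println(codecs.get("rs"));
    }
  }
}
```

With a HashMap return type, the second implementation would not compile, and 
every caller would be tied to one concrete collection class.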







[jira] [Commented] (HDFS-12036) Add audit log for getErasureCodingPolicy, getErasureCodingPolicies, getErasureCodingCodecs

2017-08-07 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116190#comment-16116190
 ] 

Huafeng Wang commented on HDFS-12036:
-

Thanks [~drankye] for your review, I just updated the patch.

> Add audit log for getErasureCodingPolicy, getErasureCodingPolicies, 
> getErasureCodingCodecs
> --
>
> Key: HDFS-12036
> URL: https://issues.apache.org/jira/browse/HDFS-12036
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-12036.001.patch, HDFS-12036.002.patch, 
> HDFS-12036.003.patch
>
>
> These three FSNamesystem operations do not yet record audit logs. I am not 
> sure how useful these audit logs would be, but thought I should file them so 
> that they don't get dropped if they turn out to be needed.






[jira] [Updated] (HDFS-12036) Add audit log for getErasureCodingPolicy, getErasureCodingPolicies, getErasureCodingCodecs

2017-08-07 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang updated HDFS-12036:

Attachment: HDFS-12036.003.patch

> Add audit log for getErasureCodingPolicy, getErasureCodingPolicies, 
> getErasureCodingCodecs
> --
>
> Key: HDFS-12036
> URL: https://issues.apache.org/jira/browse/HDFS-12036
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-12036.001.patch, HDFS-12036.002.patch, 
> HDFS-12036.003.patch
>
>
> These three FSNamesystem operations do not yet record audit logs. I am not 
> sure how useful these audit logs would be, but thought I should file them so 
> that they don't get dropped if they turn out to be needed.






[jira] [Commented] (HDFS-12222) Add EC information to BlockLocation

2017-08-02 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110440#comment-16110440
 ] 

Huafeng Wang commented on HDFS-12222:
-

Hi guys, I'd like to take this one.

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>  Labels: hdfs-ec-3.0-nice-to-have
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.






[jira] [Assigned] (HDFS-12222) Add EC information to BlockLocation

2017-08-02 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang reassigned HDFS-12222:
---

Assignee: Huafeng Wang

> Add EC information to BlockLocation
> ---
>
> Key: HDFS-12222
> URL: https://issues.apache.org/jira/browse/HDFS-12222
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: hdfs-ec-3.0-nice-to-have
>
> HDFS applications query block location information to compute splits. One 
> example of this is FileInputFormat:
> https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346
> You see bits of code like this that calculate offsets as follows:
> {noformat}
> long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
>   blkLocations[startIndex].getLength() - offset;
> {noformat}
> EC confuses this since the block locations include parity block locations as 
> well, which are not part of the logical file length. This messes up the 
> offset calculation and thus topology/caching information too.
> Applications can figure out what's a parity block by reading the EC policy 
> and then parsing the schema, but it'd be a lot better if we exposed this more 
> generically in BlockLocation instead.





