[jira] [Updated] (HDFS-15488) Add a command to list all snapshots for a snapshottable root with snapshot Ids

2020-07-27 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDFS-15488:
---
Attachment: HDFS-15488.000.patch

> Add a command to list all snapshots for a snapshottable root with snapshot Ids
> -
>
> Key: HDFS-15488
> URL: https://issues.apache.org/jira/browse/HDFS-15488
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: snapshots
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDFS-15488.000.patch
>
>
> Currently, the way to list snapshots is to do an ls on the 
> /.snapshot directory. Since the creation time is not 
> recorded, there is no way to actually figure out the chronological order of 
> snapshots. The idea here is to add a command to list the snapshots for a 
> snapshottable directory along with snapshot IDs, which grow monotonically as 
> snapshots are created in the system. With the snapshot ID, it becomes easy to 
> figure out the chronology of snapshots in the system.






[jira] [Commented] (HDFS-15496) Add UI for deleted snapshots

2020-07-27 Thread Mukul Kumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166141#comment-17166141
 ] 

Mukul Kumar Singh commented on HDFS-15496:
--

cc [~jnp] [~shashikant] [~szetszwo]

> Add UI for deleted snapshots
> 
>
> Key: HDFS-15496
> URL: https://issues.apache.org/jira/browse/HDFS-15496
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Mukul Kumar Singh
>Priority: Major
>
> Add UI for deleted snapshots
> a) Show the list of snapshots per snapshottable directory
> b) Add deleted status in the JMX output for the snapshot, along with the snap ID
> c) The NN UI should sort the snapshots by snap ID.






[jira] [Updated] (HDFS-15496) Add UI for deleted snapshots

2020-07-27 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-15496:
-
Description: 
Add UI for deleted snapshots

a) Show the list of snapshots per snapshottable directory
b) Add deleted status in the JMX output for the snapshot, along with the snap ID
c) The NN UI should sort the snapshots by snap ID.

  was:
Add a 

a) Show the list of snapshots per snapshottable directory
b) Add deleted status in the JMX output for the snapshot, along with the snap ID
c) The NN UI should sort the snapshots by snap ID.


> Add UI for deleted snapshots
> 
>
> Key: HDFS-15496
> URL: https://issues.apache.org/jira/browse/HDFS-15496
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Mukul Kumar Singh
>Priority: Major
>
> Add UI for deleted snapshots
> a) Show the list of snapshots per snapshottable directory
> b) Add deleted status in the JMX output for the snapshot, along with the snap ID
> c) The NN UI should sort the snapshots by snap ID.






[jira] [Created] (HDFS-15496) Add UI for deleted snapshots

2020-07-27 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created HDFS-15496:


 Summary: Add UI for deleted snapshots
 Key: HDFS-15496
 URL: https://issues.apache.org/jira/browse/HDFS-15496
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Mukul Kumar Singh


Add a 

a) Show the list of snapshots per snapshottable directory
b) Add deleted status in the JMX output for the snapshot, along with the snap ID
c) The NN UI should sort the snapshots by snap ID.






[jira] [Commented] (HDFS-15484) Add option in enum Rename to support batch rename

2020-07-27 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166075#comment-17166075
 ] 

Yang Yun commented on HDFS-15484:
-

Thanks [~ste...@apache.org] for the review.

I'll add webhdfs support and more tests. I'll move the patch to a GitHub PR 
after the new changes.

> Add option in enum Rename to support batch rename
> 
>
> Key: HDFS-15484
> URL: https://issues.apache.org/jira/browse/HDFS-15484
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, namenode, performance
>Affects Versions: 3.3.0
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15484.001.patch, HDFS-15484.new_method.patch
>
>
> Sometimes we need to rename many files after a task. Add a new option to the 
> Rename enum to support batch rename, which needs only one RPC and one lock. 
> For example:
> rename(new Path("/dir1/f1::/dir2/f2"), new Path("/dir3/f1::/dir4/f4"), 
> Rename.BATCH)
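For illustration, a minimal sketch of a client call under this proposal, 
assuming the Rename.BATCH option from the attached patch (it does not exist in 
upstream Hadoop):

{code:java}
// Sketch only: Options.Rename.BATCH is proposed by this patch, not upstream.
// Sources and destinations are packed into single Paths joined by "::", so
// the whole batch travels in one RPC and runs under one namesystem lock.
DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
dfs.rename(new Path("/dir1/f1::/dir2/f2"),
    new Path("/dir3/f1::/dir4/f4"),
    Options.Rename.BATCH);
{code}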






[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS

2020-07-27 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated HDFS-15098:

Attachment: HDFS-15098.009.patch
Status: Patch Available  (was: Open)

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but it has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use SM4 on HDFS as follows:*
> 1. Configure Hadoop KMS.
>  2. Test HDFS SM4:
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *Requires:*
>  1. OpenSSL version >= 1.1.1
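For reference, a hedged Java equivalent of the shell steps above, assuming key1 
was already created in the KMS with the SM4/CTR/NoPadding cipher from this 
patch:

{code:java}
// Sketch using the standard HdfsAdmin API; only the SM4 cipher behind key1
// comes from this patch, the calls themselves exist in stock Hadoop.
Configuration conf = new Configuration();
DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
HdfsAdmin admin = new HdfsAdmin(dfs.getUri(), conf);
dfs.mkdirs(new Path("/benchmarks"));
admin.createEncryptionZone(new Path("/benchmarks"), "key1",
    EnumSet.noneOf(CreateEncryptionZoneFlag.class));
{code}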






[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS

2020-07-27 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated HDFS-15098:

Attachment: (was: HDFS-15098.009.patch)

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but it has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use SM4 on HDFS as follows:*
> 1. Configure Hadoop KMS.
>  2. Test HDFS SM4:
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *Requires:*
>  1. OpenSSL version >= 1.1.1






[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS

2020-07-27 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated HDFS-15098:

Status: Open  (was: Patch Available)

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but it has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use SM4 on HDFS as follows:*
> 1. Configure Hadoop KMS.
>  2. Test HDFS SM4:
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *Requires:*
>  1. OpenSSL version >= 1.1.1






[jira] [Created] (HDFS-15495) Decommissioning a DataNode with corrupted EC files should not be blocked indefinitely

2020-07-27 Thread Siyao Meng (Jira)
Siyao Meng created HDFS-15495:
-

 Summary: Decommissioning a DataNode with corrupted EC files should 
not be blocked indefinitely
 Key: HDFS-15495
 URL: https://issues.apache.org/jira/browse/HDFS-15495
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: block placement, ec
Affects Versions: 3.0.0
Reporter: Siyao Meng
Assignee: Siyao Meng


Originally discovered in patched CDH 6.2.1 (with a bunch of EC fixes: 
HDFS-14699, HDFS-14849, HDFS-14847, HDFS-14920, HDFS-14768, HDFS-14946, 
HDFS-15186).

When there's an EC file marked as corrupted on the NN, if the admin tries to 
decommission a DataNode holding one of the remaining blocks of the corrupted EC 
file, *the decommission will never finish* unless the file is recovered by 
putting the missing blocks back in:

{code:title=The endless DatanodeAdminManager check loop, every 30s}
2020-07-23 16:36:12,805 TRACE blockmanagement.DatanodeAdminManager: Processed 0 
blocks so far this tick
2020-07-23 16:36:12,806 DEBUG blockmanagement.DatanodeAdminManager: Processing 
Decommission In Progress node 127.0.1.7:5007
2020-07-23 16:36:12,806 TRACE blockmanagement.DatanodeAdminManager: Block 
blk_-9223372036854775728_1013 numExpected=9, numLive=4
2020-07-23 16:36:12,806 INFO BlockStateChange: Block: 
blk_-9223372036854775728_1013, Expected Replicas: 9, live replicas: 4, corrupt 
replicas: 0, decommissioned replicas: 0, decommissioning replicas: 1, 
maintenance replicas: 0, live entering maintenance replicas: 0, excess 
replicas: 0, Is Open File: false, Datanodes having this block: 127.0.1.12:5012 
127.0.1.10:5010 127.0.1.8:5008 127.0.1.11:5011 127.0.1.7:5007 , Current 
Datanode: 127.0.1.7:5007, Is current datanode decommissioning: true, Is current 
datanode entering maintenance: false
2020-07-23 16:36:12,806 DEBUG blockmanagement.DatanodeAdminManager: Node 
127.0.1.7:5007 still has 1 blocks to replicate before it is a candidate to 
finish Decommission In Progress.
2020-07-23 16:36:12,806 INFO blockmanagement.DatanodeAdminManager: Checked 1 
blocks and 1 nodes this tick
{code}

"Corrupted" file here meaning the EC file doesn't have enough EC blocks in the 
block group to be reconstructed. e.g. for {{RS-6-3-1024k}}, when there are less 
than 6 blocks for an EC file, the file can no longer be retrieved correctly.

Will check on trunk as well soon.






[jira] [Work started] (HDFS-15492) Make trash root inside each snapshottable directory

2020-07-27 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-15492 started by Siyao Meng.
-
> Make trash root inside each snapshottable directory
> ---
>
> Key: HDFS-15492
> URL: https://issues.apache.org/jira/browse/HDFS-15492
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, hdfs-client
>Affects Versions: 3.2.1
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>
> We have seen FSImage corruption cases (e.g. HDFS-13101) where files inside 
> one snapshottable directory are moved outside of it. The most common case 
> of this is when trash is enabled and a user deletes some file via the command 
> line without skipTrash.
> This jira aims to make a trash root for each snapshottable directory, the 
> same as how encryption zones behave at the moment.
> This will make trash cleanup a little more expensive on the NameNode, as it 
> will have to iterate all trash roots. But this should be fine as long as 
> there aren't many snapshottable directories.
> I could make this improvement an option and disable it by default if 
> needed, such as {{dfs.namenode.snapshot.trashroot.enable}}.
> One small caveat, though, when disabling (disallowing) snapshots on a 
> snapshottable directory while this improvement is in place: the client should 
> merge the snapshottable directory's trash with that user's trash to ensure 
> proper trash cleanup.
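A hedged sketch of the expected client-visible behaviour, by analogy with 
today's encryption zone handling (the flag name follows the jira; paths are 
illustrative):

{code:java}
// Assuming the proposed dfs.namenode.snapshot.trashroot.enable is on and
// /snapdir is snapshottable, getTrashRoot should resolve inside it, just as
// it does for encryption zones today.
FileSystem fs = FileSystem.get(conf);
Path trashRoot = fs.getTrashRoot(new Path("/snapdir/sub/file1"));
// Expected: /snapdir/.Trash/<user> rather than /user/<user>/.Trash
{code}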






[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.

2020-07-27 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165980#comment-17165980
 ] 

Stephen O'Donnell commented on HDFS-15493:
--

I tested this change with an image of 9 GB, 86M inodes and 74M blocks.

My load time with parallel loading off and this new async loading off is about 
384 seconds.

Turning on only the new async block map loading, the load time is reduced to 
about 337 seconds.

With parallel loading on (4 threads and 12 sub-sections) and the async block 
map off, the load time is about 236 seconds.

Finally, turning on both parallel loading and the async block map, the load 
time increased to about 245 seconds.

Therefore, in my tests, this change slows down the parallel load slightly, but 
it does provide about a 13% speed-up with serial loading.

When you tested, are you sure the parallel loading in HDFS-14617 was enabled 
correctly, by first saving the image to create the sub-sections in the image 
index? If it is working correctly, you should see log messages like:

{code}
2020-07-27 20:21:06,566 INFO namenode.FSImageFormatProtobuf: The fsimage will 
be loaded in parallel using 4 threads
2020-07-27 20:21:06,611 INFO namenode.FSImageFormatPBINode: Loading the INode 
section in parallel with 12 sub-sections
2020-07-27 20:21:06,613 INFO namenode.FSImageFormatPBINode: Loading 86398618 
INodes.
2020-07-27 20:21:10,855 INFO util.JvmPauseMonitor: Detected pause in JVM or 
host machine (eg GC): pause of approximately 3674ms
GC pool 'ParNew' had collection(s): count=1 time=4150ms
2020-07-27 20:22:49,827 INFO namenode.FSImageFormatPBINode: Completed loading 
all INode sections. Loaded 86398618 inodes.
2020-07-27 20:22:51,141 INFO namenode.FSImageFormatPBINode: Loading the 
INodeDirectory section in parallel with 12 sub-sections
2020-07-27 20:23:23,373 INFO namenode.FSImageFormatPBINode: Completed loading 
all INodeDirectory sub-sections
{code}

It would be very interesting to try my earlier suggestion of two single-threaded 
executors and see how it performs.
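To make that suggestion concrete, a minimal sketch of the two single-threaded 
executor idea; the batch-update helpers and the inodeBatch variable are 
hypothetical stand-ins for the patch's code:

{code:java}
// One executor serializes name cache updates, the other block map updates,
// so neither lock is ever contended while both maps still load concurrently
// with the directory-section scan.
ExecutorService nameCachePool = Executors.newSingleThreadExecutor();
ExecutorService blocksMapPool = Executors.newSingleThreadExecutor();
// updateNameCache/updateBlocksMap stand in for the patch's batch helpers.
nameCachePool.submit(() -> updateNameCache(inodeBatch));
blocksMapPool.submit(() -> updateBlocksMap(inodeBatch));
{code}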

> Update block map and name cache in parallel while loading fsimage.
> --
>
> Key: HDFS-15493
> URL: https://issues.apache.org/jira/browse/HDFS-15493
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chengwei Wang
>Priority: Major
> Attachments: HDFS-15493.001.patch
>
>
> While loading the INodeDirectorySection of the fsimage, the name cache and 
> block map are updated after each inode file is added to its inode directory. 
> Letting these steps run in parallel would reduce the time cost of fsimage 
> loading.
> In our test case, with patches HDFS-13694 and HDFS-14617, the time cost to 
> load the fsimage (220M files & 240M blocks) is 470s; with this patch, the 
> time cost is reduced to 410s.






[jira] [Commented] (HDFS-15484) Add option in enum Rename to support batch rename

2020-07-27 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165917#comment-17165917
 ] 

Steve Loughran commented on HDFS-15484:
---

OK. I'm less worried now that it's HDFS only. But you do now have to worry about
hdfs and webhdfs support, and tests for all those trouble spots we've discussed 
(renames in a loop, etc.):

1. the new method should be in an interface which the hdfs and webhdfs clients 
can implement (see the sketch below);
2. its javadocs should define what the API does;
3. test all that in a test suite that both webhdfs and hdfs run against.
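A minimal sketch of point 1, with hypothetical names; the real shape would be 
settled in review:

{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.fs.Path;

// Hypothetical: neither this interface nor the method exists upstream yet.
public interface BatchRename {
  /**
   * Rename each source to the corresponding destination in a single RPC.
   * The javadocs here would need to pin down atomicity and failure semantics.
   */
  void batchRename(List<Path> srcs, List<Path> dsts) throws IOException;
}
{code}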

As far as implementation goes, the HDFS team will need to look at that detail.

Could you move the patch to being a GitHub PR? It's where we are reviewing all 
new patches.

Thanks,

> Add option in enum Rename to support batch rename
> 
>
> Key: HDFS-15484
> URL: https://issues.apache.org/jira/browse/HDFS-15484
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, namenode, performance
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15484.001.patch, HDFS-15484.new_method.patch
>
>
> Sometimes we need to rename many files after a task. Add a new option to the 
> Rename enum to support batch rename, which needs only one RPC and one lock. 
> For example:
> rename(new Path("/dir1/f1::/dir2/f2"), new Path("/dir3/f1::/dir4/f4"), 
> Rename.BATCH)






[jira] [Updated] (HDFS-15484) Add option in enum Rename to support batch rename

2020-07-27 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-15484:
--
Affects Version/s: 3.3.0

> Add option in enum Rename to support batch rename
> 
>
> Key: HDFS-15484
> URL: https://issues.apache.org/jira/browse/HDFS-15484
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, namenode, performance
>Affects Versions: 3.3.0
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15484.001.patch, HDFS-15484.new_method.patch
>
>
> Sometimes we need to rename many files after a task. Add a new option to the 
> Rename enum to support batch rename, which needs only one RPC and one lock. 
> For example:
> rename(new Path("/dir1/f1::/dir2/f2"), new Path("/dir3/f1::/dir4/f4"), 
> Rename.BATCH)






[jira] [Commented] (HDFS-15229) Truncate info should be logged at INFO level

2020-07-27 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165911#comment-17165911
 ] 

Ravuri Sushma sree commented on HDFS-15229:
---

[~brahma],

Can you please check this and review the attached patch?

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15229.001.patch
>
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of a truncate.
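A hedged sketch of the kind of change being asked for; the log level change is 
the point, while the message format here is illustrative:

{code:java}
// Raise truncate logging from debug to info and include the new length.
NameNode.stateChangeLog.info(
    "DIR* NameSystem.truncate: src={} newLength={}", src, newLength);
{code}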






[jira] [Commented] (HDFS-15465) Support WebHDFS accesses to the data stored in secure Datanode through insecure Namenode

2020-07-27 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165858#comment-17165858
 ] 

Hudson commented on HDFS-15465:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18475 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18475/])
HDFS-15465. Support WebHDFS accesses to the data stored in secure (github: rev 
026dce5334bca3b0aa9b05a6debe72db1e01842e)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestDataNodeUGIProvider.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestParameterParser.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/ParameterParser.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/DataNodeUGIProvider.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java


> Support WebHDFS accesses to the data stored in secure Datanode through 
> insecure Namenode
> 
>
> Key: HDFS-15465
> URL: https://issues.apache.org/jira/browse/HDFS-15465
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: federation, webhdfs
>Reporter: Toshihiko Uchida
>Assignee: Toshihiko Uchida
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: webhdfs-federation.pdf
>
>
> We're federating a secure HDFS cluster with an insecure cluster.
> Using HDFS RPC, we can access the data managed by the insecure Namenode and 
> stored in the secure Datanode.
> However, it does not work for WebHDFS due to HadoopIllegalArgumentException.
> {code}
> $ curl -i "http://:/webhdfs/v1/?op=OPEN"
> HTTP/1.1 307 TEMPORARY_REDIRECT
> (omitted)
> Location: 
> http://:/webhdfs/v1/?op=OPEN==0
> $ curl -i 
> "http://:/webhdfs/v1/?op=OPEN==0"
> HTTP/1.1 400 Bad Request
> (omitted)
> {"RemoteException":{"exception":"HadoopIllegalArgumentException","javaClassName":"org.apache.hadoop.HadoopIllegalArgumentException","message":"Invalid
>  argument, newValue is null"}}
> {code}
> This is because the secure Datanode expects a delegation token, but the 
> insecure Namenode does not return one to the client.
> - org.apache.hadoop.security.token.Token.decodeWritable
> {code}
>   private static void decodeWritable(Writable obj,
>  String newValue) throws IOException {
> if (newValue == null) {
>   throw new HadoopIllegalArgumentException(
>   "Invalid argument, newValue is null");
> }
> {code}
> This issue proposes to support such access for WebHDFS as well.
> The attached PDF file [^webhdfs-federation.pdf] depicts our current 
> architecture and proposal.






[jira] [Resolved] (HDFS-15465) Support WebHDFS accesses to the data stored in secure Datanode through insecure Namenode

2020-07-27 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun resolved HDFS-15465.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> Support WebHDFS accesses to the data stored in secure Datanode through 
> insecure Namenode
> 
>
> Key: HDFS-15465
> URL: https://issues.apache.org/jira/browse/HDFS-15465
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: federation, webhdfs
>Reporter: Toshihiko Uchida
>Assignee: Toshihiko Uchida
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: webhdfs-federation.pdf
>
>
> We're federating a secure HDFS cluster with an insecure cluster.
> Using HDFS RPC, we can access the data managed by the insecure Namenode and 
> stored in the secure Datanode.
> However, it does not work for WebHDFS due to HadoopIllegalArgumentException.
> {code}
> $ curl -i "http://:/webhdfs/v1/?op=OPEN"
> HTTP/1.1 307 TEMPORARY_REDIRECT
> (omitted)
> Location: 
> http://:/webhdfs/v1/?op=OPEN==0
> $ curl -i 
> "http://:/webhdfs/v1/?op=OPEN==0"
> HTTP/1.1 400 Bad Request
> (omitted)
> {"RemoteException":{"exception":"HadoopIllegalArgumentException","javaClassName":"org.apache.hadoop.HadoopIllegalArgumentException","message":"Invalid
>  argument, newValue is null"}}
> {code}
> This is because the secure Datanode expects a delegation token, but the 
> insecure Namenode does not return one to the client.
> - org.apache.hadoop.security.token.Token.decodeWritable
> {code}
>   private static void decodeWritable(Writable obj,
>  String newValue) throws IOException {
> if (newValue == null) {
>   throw new HadoopIllegalArgumentException(
>   "Invalid argument, newValue is null");
> }
> {code}
> This issue proposes to support such access for WebHDFS as well.
> The attached PDF file [^webhdfs-federation.pdf] depicts our current 
> architecture and proposal.






[jira] [Commented] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2020-07-27 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165817#comment-17165817
 ] 

Ravuri Sushma sree commented on HDFS-15494:
---

Attached the patch skipping the test on Windows. Please review.

> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> Fails on Windows
> ---
>
> Key: HDFS-15494
> URL: https://issues.apache.org/jira/browse/HDFS-15494
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15494.001.patch
>
>
> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> fails on Windows because renaming an RBW replica to Finalized is not 
> supported on Windows.
> The test should be skipped on Windows.
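A hedged sketch of how such a skip is commonly written in Hadoop tests:

{code:java}
import static org.apache.hadoop.test.PlatformAssumptions.assumeNotWindows;

@Test
public void testReplicaCachingGetSpaceUsedByRBWReplica() throws Exception {
  // Skip on Windows, where the RBW -> Finalized rename is not supported.
  assumeNotWindows();
  // ... rest of the test unchanged
}
{code}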






[jira] [Updated] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2020-07-27 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15494:
--
Attachment: HDFS-15494.001.patch

> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> Fails on Windows
> ---
>
> Key: HDFS-15494
> URL: https://issues.apache.org/jira/browse/HDFS-15494
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15494.001.patch
>
>
> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> fails on Windows because renaming an RBW replica to Finalized is not 
> supported on Windows.
> The test should be skipped on Windows.






[jira] [Created] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2020-07-27 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15494:
-

 Summary: TestReplicaCachingGetSpaceUsed 
#testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows
 Key: HDFS-15494
 URL: https://issues.apache.org/jira/browse/HDFS-15494
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Ravuri Sushma sree


TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
fails on Windows because renaming an RBW replica to Finalized is not supported 
on Windows.

The test should be skipped on Windows.






[jira] [Commented] (HDFS-15439) Setting dfs.mover.retry.max.attempts to negative value will retry forever.

2020-07-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165713#comment-17165713
 ] 

Hadoop QA commented on HDFS-15439:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
49s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m  
3s{color} | {color:blue} Used deprecated FindBugs config; considering switching 
to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
1s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 19s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}123m 46s{color} 
| {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}203m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestDecommissionWithStriped |
|   | hadoop.hdfs.TestFileChecksumCompositeCrc |
|   | hadoop.hdfs.TestDFSStorageStateRecovery |
|   | hadoop.hdfs.TestMaintenanceState |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSInputStreamBlockLocations |
|   | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.TestDFSStripedOutputStreamUpdatePipeline |
|   | hadoop.hdfs.TestDecommission |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestStripedFileAppend |
|   | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
|   | hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader |
|   | hadoop.hdfs.server.datanode.TestBPOfferService 

[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.

2020-07-27 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165674#comment-17165674
 ] 

Stephen O'Donnell commented on HDFS-15493:
--

Hi [~smarthan]. Thanks for this patch. I think it is a good idea. I have some 
thoughts below on things we should try, which might improve things further.

Only one thread can update the cache map at a time and only one can update the 
block map, due to locking. The calls to these methods already process a batch, 
so they can hold the lock for a relatively long time. With that in mind, I 
wonder if the default of 4 threads makes sense: only 2 can ever be active at 
any time, and it would be possible for all 4 threads to be attempting to 
update the cacheMap while none are updating the blockMap. That means 2 or 3 
threads will always be blocked.

I think it would be worth testing two single-threaded executor pools, one for 
the cacheMap and one for the blockMap, and seeing whether that performs the 
same or better. What do you think?

I am not sure that waiting only 1ms before failing gives the executor enough 
time to complete pending tasks. It may be possible for there to be a lot of 
queued requests which take a few seconds to finish processing:

{code}
  if (blocksMapUpdateExecutor != null) {
blocksMapUpdateExecutor.shutdown();
try {
  while (!blocksMapUpdateExecutor.isTerminated()) {
blocksMapUpdateExecutor.awaitTermination(1, TimeUnit.MILLISECONDS);
  }
} catch (InterruptedException e) {
  LOG.error("Interrupted waiting for blocksMap update threads.", e);
  throw new IOException(e);
}
  }
{code}

We could wait 5 seconds and, if there is a timeout, log a warning and then 
wait again, perhaps 10 times before failing. This would also let us know if 
loading the INodeDirectory section is having to wait on the new background 
tasks before the next stage can start.
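A minimal sketch of that retry-with-warning shutdown, using only the standard 
ExecutorService API:

{code:java}
blocksMapUpdateExecutor.shutdown();
try {
  int attempts = 0;
  // Wait in 5-second slices, warn on each timeout, give up after 10 tries.
  while (!blocksMapUpdateExecutor.awaitTermination(5, TimeUnit.SECONDS)) {
    LOG.warn("Still waiting for blocksMap update tasks to complete...");
    if (++attempts >= 10) {
      throw new IOException("blocksMap update tasks did not complete in time");
    }
  }
} catch (InterruptedException e) {
  LOG.error("Interrupted waiting for blocksMap update threads.", e);
  throw new IOException(e);
}
{code}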

I would like to avoid the changes in FSImageFormatProtobuf.loadInternal() and 
passing all the null values to `inodeLoader.loadINodeDirectorySection(...)` if 
we can. I understand those changes are needed to shut down the new executor. 
Therefore, let's wait and see how the two single-threaded executors work, and 
whether we need to wait on the thread pool to shut down, as that may influence 
how we shut down the executors.

If there is a delay in the thread pools shutting down, then we could consider 
moving the `blocksMapUpdateExecutor.shutdown()` call into a 
Loader.shutdownExecutors() method which we call after loading all sections. 
Don't make this change until we see what happens with the other experiments 
above.

Can I also ask:

1. Did you try HDFS-13693, and did it make any further speed improvement?

2. Could you try my suggestion of two single-threaded executors and see what 
difference it makes to the runtime?

3. Would you be able to run a test with HDFS-14617 disabled, to give us an idea 
of how much HDFS-14617 improves things on its own?

> Update block map and name cache in parallel while loading fsimage.
> --
>
> Key: HDFS-15493
> URL: https://issues.apache.org/jira/browse/HDFS-15493
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chengwei Wang
>Priority: Major
> Attachments: HDFS-15493.001.patch
>
>
> While loading the INodeDirectorySection of the fsimage, the name cache and 
> block map are updated after each inode file is added to its inode directory. 
> Letting these steps run in parallel would reduce the time cost of fsimage 
> loading.
> In our test case, with patches HDFS-13694 and HDFS-14617, the time cost to 
> load the fsimage (220M files & 240M blocks) is 470s; with this patch, the 
> time cost is reduced to 410s.






[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS

2020-07-27 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated HDFS-15098:

Attachment: HDFS-15098.009.patch
Status: Patch Available  (was: Open)

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but it has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use SM4 on HDFS as follows:*
> 1. Configure Hadoop KMS.
>  2. Test HDFS SM4:
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *Requires:*
>  1. OpenSSL version >= 1.1.1






[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS

2020-07-27 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated HDFS-15098:

Status: Open  (was: Patch Available)

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but it has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use SM4 on HDFS as follows:*
> 1. Configure Hadoop KMS.
>  2. Test HDFS SM4:
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *Requires:*
>  1. OpenSSL version >= 1.1.1






[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS

2020-07-27 Thread liusheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liusheng updated HDFS-15098:

Attachment: (was: HDFS-15098.009.patch)

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but it has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use SM4 on HDFS as follows:*
> 1. Configure Hadoop KMS.
>  2. Test HDFS SM4:
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *Requires:*
>  1. OpenSSL version >= 1.1.1






[jira] [Commented] (HDFS-15484) Add option in enum Rename to support batch rename

2020-07-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165633#comment-17165633
 ] 

Hadoop QA commented on HDFS-15484:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:blue}0{color} | {color:blue} prototool {color} | {color:blue}  0m  
0s{color} | {color:blue} prototool was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m 
20s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 
28s{color} | {color:red} hadoop-hdfs-project in the patch failed. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  3m 28s{color} | 
{color:red} hadoop-hdfs-project in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m 28s{color} 
| {color:red} hadoop-hdfs-project in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 58s{color} | {color:orange} hadoop-hdfs-project: The patch generated 12 new 
+ 394 unchanged - 0 fixed = 406 total (was 394) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
25s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 3 new 
+ 0 unchanged - 0 fixed = 3 total (was 0) {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m  
1s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 4 new + 0 
unchanged - 0 fixed = 4 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
3s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}130m 18s{color} 
| {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| 

[jira] [Commented] (HDFS-15439) Setting dfs.mover.retry.max.attempts to negative value will retry forever.

2020-07-27 Thread AMC-team (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165538#comment-17165538
 ] 

AMC-team commented on HDFS-15439:
-

Sorry about the mistaken operation; I will upload a valid patch later.

> Setting dfs.mover.retry.max.attempts to negative value will retry forever.
> --
>
> Key: HDFS-15439
> URL: https://issues.apache.org/jira/browse/HDFS-15439
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: AMC-team
>Priority: Major
> Attachments: HDFS-15439.000.patch
>
>
> Configuration parameter "dfs.mover.retry.max.attempts" defines the maximum 
> number of retries before the mover considers the move failed. There is no 
> validation code, so this parameter can accept any int value.
> Theoretically, setting this value to <= 0 should mean no retry at all. 
> However, if you set it to a negative value, the condition for declaring the 
> retries failed is never satisfied, because the if statement is "*if 
> (retryCount.get() == retryMaxAttempts)*". The retry count is always 
> incremented by retryCount.incrementAndGet() after a failure but never 
> *equals* *retryMaxAttempts*.
> {code:java}
> private Result processNamespace() throws IOException {
>   ... //wait for pending move to finish and retry the failed migration
>   if (hasFailed && !hasSuccess) {
> if (retryCount.get() == retryMaxAttempts) {
>   result.setRetryFailed();
>   LOG.error("Failed to move some block's after "
>   + retryMaxAttempts + " retries.");
>   return result;
> } else {
>   retryCount.incrementAndGet();
> }
>   } else {
> // Reset retry count if no failure.
> retryCount.set(0);
>   }
>   ...
> }
> {code}
> *How to fix*
> Add validation so that "dfs.mover.retry.max.attempts" accepts only 
> non-negative values, or change the if statement condition to trigger when 
> the retry count meets or exceeds the maximum.
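A hedged sketch of the second option; the clamp to zero is one illustrative way 
to make non-positive maximums mean "no retries":

{code:java}
// Fire once the retry count reaches the maximum instead of requiring exact
// equality, and treat any non-positive maximum as "no retries at all".
if (retryCount.get() >= Math.max(retryMaxAttempts, 0)) {
  result.setRetryFailed();
  LOG.error("Failed to move some blocks after "
      + retryMaxAttempts + " retries.");
  return result;
} else {
  retryCount.incrementAndGet();
}
{code}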






[jira] [Updated] (HDFS-15439) Setting dfs.mover.retry.max.attempts to negative value will retry forever.

2020-07-27 Thread AMC-team (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMC-team updated HDFS-15439:

Attachment: (was: HDFS-15439.001.patch)

> Setting dfs.mover.retry.max.attempts to negative value will retry forever.
> --
>
> Key: HDFS-15439
> URL: https://issues.apache.org/jira/browse/HDFS-15439
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: AMC-team
>Priority: Major
> Attachments: HDFS-15439.000.patch
>
>
> Configuration parameter "dfs.mover.retry.max.attempts" defines the maximum 
> number of retries before the mover considers the move failed. There is no 
> validation code, so this parameter can accept any int value.
> Theoretically, setting this value to <= 0 should mean no retry at all. 
> However, if you set it to a negative value, the condition for declaring the 
> retries failed is never satisfied, because the if statement is "*if 
> (retryCount.get() == retryMaxAttempts)*". The retry count is always 
> incremented by retryCount.incrementAndGet() after a failure but never 
> *equals* *retryMaxAttempts*.
> {code:java}
> private Result processNamespace() throws IOException {
>   ... //wait for pending move to finish and retry the failed migration
>   if (hasFailed && !hasSuccess) {
> if (retryCount.get() == retryMaxAttempts) {
>   result.setRetryFailed();
>   LOG.error("Failed to move some block's after "
>   + retryMaxAttempts + " retries.");
>   return result;
> } else {
>   retryCount.incrementAndGet();
> }
>   } else {
> // Reset retry count if no failure.
> retryCount.set(0);
>   }
>   ...
> }
> {code}
> *How to fix*
> Add validation so that "dfs.mover.retry.max.attempts" accepts only 
> non-negative values, or change the if statement condition to trigger when 
> the retry count meets or exceeds the maximum.






[jira] [Issue Comment Deleted] (HDFS-15439) Setting dfs.mover.retry.max.attempts to negative value will retry forever.

2020-07-27 Thread AMC-team (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

AMC-team updated HDFS-15439:

Comment: was deleted

(was: Upload a patch based on [~ayushtkn]'s suggestion. Thanks!)

> Setting dfs.mover.retry.max.attempts to negative value will retry forever.
> --
>
> Key: HDFS-15439
> URL: https://issues.apache.org/jira/browse/HDFS-15439
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: AMC-team
>Priority: Major
> Attachments: HDFS-15439.000.patch
>
>
> Configuration parameter "dfs.mover.retry.max.attempts" defines the maximum 
> number of retries before the mover considers the move failed. There is no 
> validation code, so this parameter can accept any int value.
> Theoretically, setting this value to <= 0 should mean no retry at all. 
> However, if you set it to a negative value, the condition for declaring the 
> retries failed is never satisfied, because the if statement is "*if 
> (retryCount.get() == retryMaxAttempts)*". The retry count is always 
> incremented by retryCount.incrementAndGet() after a failure but never 
> *equals* *retryMaxAttempts*.
> {code:java}
> private Result processNamespace() throws IOException {
>   ... //wait for pending move to finish and retry the failed migration
>   if (hasFailed && !hasSuccess) {
> if (retryCount.get() == retryMaxAttempts) {
>   result.setRetryFailed();
>   LOG.error("Failed to move some block's after "
>   + retryMaxAttempts + " retries.");
>   return result;
> } else {
>   retryCount.incrementAndGet();
> }
>   } else {
> // Reset retry count if no failure.
> retryCount.set(0);
>   }
>   ...
> }
> {code}
> *How to fix*
> Add validation so that "dfs.mover.retry.max.attempts" accepts only 
> non-negative values, or change the if statement condition to trigger when 
> the retry count meets or exceeds the maximum.






[jira] [Commented] (HDFS-15493) Update block map and name cache in parallel while loading fsimage.

2020-07-27 Thread Chengwei Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165525#comment-17165525
 ] 

Chengwei Wang commented on HDFS-15493:
--

Hi [~sodonnell] [~hexiaoqiao] [~weichiu], can you help review this patch?

> Update block map and name cache in parallel while loading fsimage.
> --
>
> Key: HDFS-15493
> URL: https://issues.apache.org/jira/browse/HDFS-15493
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chengwei Wang
>Priority: Major
> Attachments: HDFS-15493.001.patch
>
>
> While loading the INodeDirectorySection of the fsimage, the name cache and 
> block map are updated after each inode file is added to its inode directory. 
> Letting these steps run in parallel would reduce the time cost of fsimage 
> loading.
> In our test case, with patches HDFS-13694 and HDFS-14617, the time cost to 
> load the fsimage (220M files & 240M blocks) is 470s; with this patch, the 
> time cost is reduced to 410s.






[jira] [Comment Edited] (HDFS-15484) Add option in enum Rename to support batch rename

2020-07-27 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165492#comment-17165492
 ] 

Yang Yun edited comment on HDFS-15484 at 7/27/20, 7:34 AM:
---

Updated a new patch, HDFS-15484.new_method.patch.
 Added a new solution to implement the new method 'batchRename' for batch 
rename. FYI.

 


was (Author: hadoop_yangyun):
Update a new patch  HDFS-15484.new_method.patch.
Add a new solution to impliment new method 'batchRename' for batch rename.  FYI.

 

> Add option in enum Rename to support batch rename
> 
>
> Key: HDFS-15484
> URL: https://issues.apache.org/jira/browse/HDFS-15484
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, namenode, performance
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15484.001.patch, HDFS-15484.new_method.patch
>
>
> Sometimes we need to rename many files after a task. Add a new option to the 
> Rename enum to support batch rename, which needs only one RPC and one lock. 
> For example:
> rename(new Path("/dir1/f1::/dir2/f2"), new Path("/dir3/f1::/dir4/f4"), 
> Rename.BATCH)






[jira] [Commented] (HDFS-15484) Add option in enum Rename to support batch rename

2020-07-27 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165492#comment-17165492
 ] 

Yang Yun commented on HDFS-15484:
-

Updated a new patch, HDFS-15484.new_method.patch.
Added a new solution to implement the new method 'batchRename' for batch rename. FYI.

 

> Add option in enum Rename to support batch rename
> 
>
> Key: HDFS-15484
> URL: https://issues.apache.org/jira/browse/HDFS-15484
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, namenode, performance
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15484.001.patch, HDFS-15484.new_method.patch
>
>
> Sometimes we need to rename many files after a task. Add a new option to the 
> Rename enum to support batch rename, which needs only one RPC and one lock. 
> For example:
> rename(new Path("/dir1/f1::/dir2/f2"), new Path("/dir3/f1::/dir4/f4"), 
> Rename.BATCH)






[jira] [Updated] (HDFS-15484) Add option in enum Rename to support batch rename

2020-07-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15484:

Attachment: HDFS-15484.new_method.patch

> Add option in enum Rename to support batch rename
> 
>
> Key: HDFS-15484
> URL: https://issues.apache.org/jira/browse/HDFS-15484
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient, namenode, performance
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15484.001.patch, HDFS-15484.new_method.patch
>
>
> Sometimes we need to rename many files after a task. Add a new option to the 
> Rename enum to support batch rename, which needs only one RPC and one lock. 
> For example:
> rename(new Path("/dir1/f1::/dir2/f2"), new Path("/dir3/f1::/dir4/f4"), 
> Rename.BATCH)


