[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-08-21 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181694#comment-17181694
 ] 

lindongdong commented on HDFS-15098:


[~seanlau] OK, I agree. There is no need to add the unused method.

Please go on~

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch, image-2020-08-19-16-54-41-341.png
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use sm4 on hdfs as follows:*
> 1.Configure Hadoop KMS
>  2.test HDFS sm4
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *requires:*
>  1.openssl version >=1.1.1






[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-08-18 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180242#comment-17180242
 ] 

lindongdong commented on HDFS-15098:


Hi [~seanlau], the picture can't be seen. Can you send it again?

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use sm4 on hdfs as follows:*
> 1.Configure Hadoop KMS
>  2.test HDFS sm4
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *requires:*
>  1.openssl version >=1.1.1






[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-08-17 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179292#comment-17179292
 ] 

lindongdong commented on HDFS-15098:


Hi, [~seanlau], thank you for your reply.

 

We met an issue with OpenSSL compatibility: during a rolling upgrade from 2.7.2 
to 3.1.1, we used the distributed cache to make sure the MR jobs kept working, 
but the problem still happened, because the old jars call the old native 
methods, which no longer exist in the new native library. So I strongly suggest 
keeping the old methods in the new native code.

 

As for 
[OpensslSecureRandom.c|https://github.com/apache/hadoop/pull/2211/files#diff-3ee504e8c2a27c840c39c4496a27cc02],
 I think that is already done in https://issues.apache.org/jira/browse/HADOOP-16647; 
please check it.
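
To make the failure mode concrete, here is a minimal, hypothetical sketch (the 
class name and call shapes are made up for illustration; this is not Hadoop 
code) of why an old jar only fails once the method is actually called:
{code:java}
// Hypothetical demo of the rolling-upgrade failure: a class compiled into
// an old jar declares a native method, and the JVM resolves that JNI symbol
// lazily, against whatever native library is installed on the node. If the
// new library dropped the old entry point, the first call throws.
public class OldCipherStub {
  static {
    System.loadLibrary("hadoop");  // loads the NEW native library
  }

  // Old signature baked into the old jar; the matching JNI symbol must
  // still exist in the new library, or linking fails at first invocation.
  private native long init(long context, int mode, int alg, int padding,
      byte[] key, byte[] iv);

  public static void main(String[] args) {
    // Throws java.lang.UnsatisfiedLinkError mid-job if the symbol is gone.
    new OldCipherStub().init(0L, 0, 0, 0, new byte[16], new byte[16]);
  }
}
{code}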

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use sm4 on hdfs as follows:*
> 1.Configure Hadoop KMS
>  2.test HDFS sm4
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *requires:*
>  1.openssl version >=1.1.1






[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-08-17 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178957#comment-17178957
 ] 

lindongdong commented on HDFS-15098:


[~seanlau], another question: why modify OpensslSecureRandom.c?

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use sm4 on hdfs as follows:*
> 1.Configure Hadoop KMS
>  2.test HDFS sm4
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *requires:*
>  1.openssl version >=1.1.1






[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-08-17 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178954#comment-17178954
 ] 

lindongdong commented on HDFS-15098:


[~seanlau], the init and clean methods should keep their old definitions (that 
is, without the engine parameter); otherwise this will introduce a 
compatibility issue.
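
A minimal sketch of that rule (signatures taken from this discussion, not from 
any committed patch): declare both generations side by side, so code compiled 
against either version keeps linking.
{code:java}
// Sketch: keep BOTH generations of the native methods in the Java class
// and in libhadoop, so binaries built against either version still link.
// The engine-aware variants are the new additions.
public class OpensslCipherCompatSketch {
  private native long init(long context, int mode, int alg, int padding,
      byte[] key, byte[] iv);                 // old definition, kept

  private native long init(long context, int mode, int alg, int padding,
      byte[] key, byte[] iv, long engine);    // new, engine-aware

  private native void clean(long context);    // old definition, kept

  private native void clean(long context, long engine);  // new
}
{code}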

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use sm4 on hdfs as follows:*
> 1.Configure Hadoop KMS
>  2.test HDFS sm4
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *requires:*
>  1.openssl version >=1.1.1






[jira] [Commented] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-08-13 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176885#comment-17176885
 ] 

lindongdong commented on HDFS-15039:


OK, thanks~~

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.006.patch, HDFS-15039.patch, 
> HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch
>
>
> When ReplicaCachingGetSpaceUsed is used to get the volume space used, it will 
> call File.length() for every replica meta file. That adds more disk IO; we 
> found the slow log below. For a finalized replica, the size of the meta file 
> does not change, so I think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}






[jira] [Commented] (HDFS-15039) Cache meta file length of FinalizedReplica to reduce call File.length()

2020-08-13 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176827#comment-17176827
 ] 

lindongdong commented on HDFS-15039:


[~hadoop_yangyun], for the last patch, I have one doubt:

 

Why is the metaLength an int type? To save memory? But when it is read, it 
returns a long value, so there is a conversion. Is that bad for performance?
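
For reference, the pattern in question can be sketched like this (field and 
method names are assumptions for illustration, not the actual patch); the 
int-to-long widening on read is a single JVM instruction, so the conversion 
itself should not be measurable:
{code:java}
// Illustrative sketch: cache the meta file length in an int to keep each
// FinalizedReplica small, and widen to long when read.
class ReplicaMetaLengthSketch {
  private int metaLength;            // cached File.length() of the meta file

  void setMetaLength(long len) {
    // Meta files are small (a header plus checksums), so int is assumed
    // to be wide enough here.
    this.metaLength = (int) len;
  }

  long getMetaLength() {
    return metaLength;               // implicit int -> long widening (i2l)
  }
}
{code}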

> Cache meta file length of FinalizedReplica to reduce call File.length()
> ---
>
> Key: HDFS-15039
> URL: https://issues.apache.org/jira/browse/HDFS-15039
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15039.006.patch, HDFS-15039.patch, 
> HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch, HDFS-15039.patch
>
>
> When ReplicaCachingGetSpaceUsed is used to get the volume space used, it will 
> call File.length() for every replica meta file. That adds more disk IO; we 
> found the slow log below. For a finalized replica, the size of the meta file 
> does not change, so I think we can cache the value.
> {code:java}
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  Refresh dfs used, bpid: BP-898717543-10.75.1.240-1519386995727 replicas 
> size: 1166 dfsUsed: 72227113183 on volume: 
> DS-3add8d62-d69a-4f5a-a29f-b7bbb400af2e duration: 17206ms{code}






[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-07-24 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164267#comment-17164267
 ] 

lindongdong commented on HDFS-15098:


[~seanlau], hi, the latest patch is still not OK.

 

For these two old methods, please handle them in the native code:

{code:java}
private native long init(long context, int mode, int alg, int padding,
    byte[] key, byte[] iv);

private native void clean(long context);
{code}

Also, add this method, for the same reason:

{code:java}
private OpensslCipher(long context, int alg, int padding) {
{code}
 

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use sm4 on hdfs as follows:*
> 1.Configure Hadoop KMS
>  2.test HDFS sm4
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *requires:*
>  1.openssl version >=1.1.1






[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-07-24 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164185#comment-17164185
 ] 

lindongdong commented on HDFS-15098:


[~zZtai], hi, I found some code that is just for debugging; maybe remove it:
{code:java}
System.out.println("Now Codec is OpensslAesCtrCryptoCodec");{code}
 

and this one:
{code:java}
public void log(GeneralSecurityException e) {
 LOG.warn(e.getMessage());
 }{code}
 

For compatibility, I think it is better to keep the old methods, the ones 
without the engine parameter:
{code:java}
private native long init(long context, int mode, int alg, int padding,
    byte[] key, byte[] iv);

private native long init(long context, int mode, int alg, int padding,
    byte[] key, byte[] iv, long engine);

private native void clean(long context, long engine);{code}
 

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: zZtai
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, 
> HDFS-15098.009.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use sm4 on hdfs as follows:*
> 1.Configure Hadoop KMS
>  2.test HDFS sm4
>  hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
>  hdfs dfs -mkdir /benchmarks
>  hdfs crypto -createZone -keyName key1 -path /benchmarks
> *requires:*
>  1.openssl version >=1.1.1






[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-06-12 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134136#comment-17134136
 ] 

lindongdong commented on HDFS-15098:


[~seanlau] awesome! (y)

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: zZtai
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, 
> HDFS-15098.006.patch, HDFS-15098.007.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use sm4 on hdfs as follows:*
> 1.download Bouncy Castle Crypto APIs from bouncycastle.org
> [https://bouncycastle.org/download/bcprov-ext-jdk15on-165.jar]
> 2.Configure JDK
> Place bcprov-ext-jdk15on-165.jar in $JAVA_HOME/jre/lib/ext directory,
> add "security.provider.10=org.bouncycastle.jce.provider.BouncyCastleProvider" 
> to $JAVA_HOME/jre/lib/security/java.security file
> 3.Configure Hadoop KMS
> 4.test HDFS sm4
> hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
> hdfs dfs -mkdir /benchmarks
> hdfs crypto -createZone -keyName key1 -path /benchmarks
> *requires:*
> 1.openssl version >=1.1.1
> 2.configure Bouncy Castle Crypto on JDK






[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-06-09 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128921#comment-17128921
 ] 

lindongdong commented on HDFS-15098:


[~seanlau] get it~

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: zZtai
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use sm4 on hdfs as follows:*
> 1.download Bouncy Castle Crypto APIs from bouncycastle.org
> [https://bouncycastle.org/download/bcprov-ext-jdk15on-165.jar]
> 2.Configure JDK
> Place bcprov-ext-jdk15on-165.jar in $JAVA_HOME/jre/lib/ext directory,
> add "security.provider.10=org.bouncycastle.jce.provider.BouncyCastleProvider" 
> to $JAVA_HOME/jre/lib/security/java.security file
> 3.Configure Hadoop KMS
> 4.test HDFS sm4
> hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
> hdfs dfs -mkdir /benchmarks
> hdfs crypto -createZone -keyName key1 -path /benchmarks
> *requires:*
> 1.openssl version >=1.1.1
> 2.configure Bouncy Castle Crypto on JDK






[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-05-21 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112893#comment-17112893
 ] 

lindongdong commented on HDFS-15098:


Hi, [~zZtai], why are these 2 steps needed:


1.download Bouncy Castle Crypto APIs from bouncycastle.org
[https://bouncycastle.org/download/bcprov-ext-jdk15on-165.jar]
2.Configure JDK
Place bcprov-ext-jdk15on-165.jar in $JAVA_HOME/jre/lib/ext directory,
add "security.provider.10=org.bouncycastle.jce.provider.BouncyCastleProvider" 
to $JAVA_HOME/jre/lib/security/java.security file

 

IMO, OpenSSL supports SM4 fully, so we don't need the jar.
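
For context, a small self-contained probe (assuming a stock JDK with no extra 
providers installed) of why those two steps exist at all: the pure-JCE path 
needs a provider that implements SM4, such as Bouncy Castle, while the 
OpenSSL-native codec path bypasses JCE entirely.
{code:java}
import javax.crypto.Cipher;

// Probe: does the current JVM have any JCE provider for SM4? On a stock
// JDK this prints the exception; with Bouncy Castle registered it succeeds.
public class Sm4JceProbe {
  public static void main(String[] args) {
    try {
      Cipher.getInstance("SM4/CTR/NoPadding");
      System.out.println("SM4 is available through JCE");
    } catch (Exception e) {
      System.out.println("No JCE SM4 provider: " + e);
    }
  }
}
{code}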

 

> Add SM4 encryption method for HDFS
> --
>
> Key: HDFS-15098
> URL: https://issues.apache.org/jira/browse/HDFS-15098
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.4.0
>Reporter: liusheng
>Assignee: zZtai
>Priority: Major
>  Labels: sm4
> Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, 
> HDFS-15098.003.patch, HDFS-15098.004.patch
>
>
> SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard 
> for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure).
>  SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far 
> been rejected by ISO. One of the reasons for the rejection has been 
> opposition to the WAPI fast-track proposal by the IEEE. Please see:
> [https://en.wikipedia.org/wiki/SM4_(cipher)]
>  
> *Use sm4 on hdfs as follows:*
> 1.download Bouncy Castle Crypto APIs from bouncycastle.org
> [https://bouncycastle.org/download/bcprov-ext-jdk15on-165.jar]
> 2.Configure JDK
> Place bcprov-ext-jdk15on-165.jar in $JAVA_HOME/jre/lib/ext directory,
> add "security.provider.10=org.bouncycastle.jce.provider.BouncyCastleProvider" 
> to $JAVA_HOME/jre/lib/security/java.security file
> 3.Configure Hadoop KMS
> 4.test HDFS sm4
> hadoop key create key1 -cipher 'SM4/CTR/NoPadding'
> hdfs dfs -mkdir /benchmarks
> hdfs crypto -createZone -keyName key1 -path /benchmarks
> *requires:*
> 1.openssl version >=1.1.1
> 2.configure Bouncy Castle Crypto on JDK






[jira] [Commented] (HDFS-12348) disable removing blocks to trash while rolling upgrade

2019-11-28 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984287#comment-16984287
 ] 

lindongdong commented on HDFS-12348:


Hi, [~surendrasingh], thanks for your patch.

I found a problem with the patch:

After we do rolling upgrade prepare, the old DN will move deleted files to 
trash.

With this patch, the new DN will never delete the trash dir.

> disable removing blocks to trash while rolling upgrade
> --
>
> Key: HDFS-12348
> URL: https://issues.apache.org/jira/browse/HDFS-12348
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Major
> Attachments: HDFS-12348.001.patch, HDFS-12348.002.patch, 
> HDFS-12348.003.patch
>
>
> The DataNode moves block files and meta files to trash during a rolling 
> upgrade, and deletes them when executing finalize. 
> This leads the datanode's disks to become full, because
> (1) files are frequently created and deleted (e.g., HBase compaction);
> (2) the cluster is very big, and a rolling upgrade often lasts several days.
> Currently our solution is to clean the trash by hand, but this is very 
> dangerous in a production environment. 
> We think disabling the trash of the datanode may be a good way to keep the 
> disks from filling up.






[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits

2019-11-27 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16983435#comment-16983435
 ] 

lindongdong commented on HDFS-12733:


Hi, [~hexiaoqiao]  [~brahmareddy] [~ayushtkn]

I have met many problems where the edits in the JN are missing, so the NN fails 
to start. Every time, I copy the good edits from the NN to the JN to solve the 
problem.

Disabling the edits in the NN can't save much space, but keeping them is a 
solution for fixing the cluster.

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch, 
> HDFS-12733.006.patch, HDFS-12733.007.patch
>
>
> As of now, edits are written to both local and shared locations, which is 
> redundant, since the local edits are never used in an HA setup.
> Disabling local edits gives a small performance improvement.






[jira] [Commented] (HDFS-14308) DFSStripedInputStream curStripeBuf is not freed by unbuffer()

2019-10-22 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957528#comment-16957528
 ] 

lindongdong commented on HDFS-14308:


Hi, [~zhaoyim], Thanks for your work.

A suggestion for the latest patch: using "super.unbuffer()" rather than 
"closeCurrentBlockReaders()" would be better.

> DFSStripedInputStream curStripeBuf is not freed by unbuffer()
> -
>
> Key: HDFS-14308
> URL: https://issues.apache.org/jira/browse/HDFS-14308
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Affects Versions: 3.0.0
>Reporter: Joe McDonnell
>Assignee: Zhao Yi Ming
>Priority: Major
> Attachments: ec_heap_dump.png
>
>
> Some users of HDFS cache opened HDFS file handles to avoid repeated 
> roundtrips to the NameNode. For example, Impala caches up to 20,000 HDFS file 
> handles by default. Recent tests on erasure coded files show that the open 
> file handles can consume a large amount of memory when not in use.
> For example, here is output from Impala's JMX endpoint when 608 file handles 
> are cached
> {noformat}
> {
> "name": "java.nio:type=BufferPool,name=direct",
> "modelerType": "sun.management.ManagementFactoryHelper$1",
> "Name": "direct",
> "TotalCapacity": 1921048960,
> "MemoryUsed": 1921048961,
> "Count": 633,
> "ObjectName": "java.nio:type=BufferPool,name=direct"
> },{noformat}
> This shows direct buffer memory usage of 3MB per DFSStripedInputStream. 
> Attached is output from Eclipse MAT showing that the direct buffers come from 
> DFSStripedInputStream objects. Both Impala and HBase call unbuffer() when a 
> file handle is being cached and potentially unused for significant chunks of 
> time, yet this shows that the memory remains in use.
> To support caching file handles on erasure coded files, DFSStripedInputStream 
> should avoid holding buffers after the unbuffer() call. See HDFS-7694. 
> "unbuffer()" is intended to move an input stream to a lower memory state to 
> support these caching use cases. In particular, the curStripeBuf seems to be 
> allocated from the BUFFER_POOL on a resetCurStripeBuffer(true) call. It is 
> not freed until close().






[jira] [Commented] (HDFS-14554) Avoid to process duplicate blockreport of datanode when namenode in startup safemode

2019-09-29 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940607#comment-16940607
 ] 

lindongdong commented on HDFS-14554:


Hi, [~hexiaoqiao], Thanks for your work.

I tested your patch, and a lot of blocks are not reported. From my analysis, 
this problem happens when the DN reports its blocks one disk at a time.

A boolean is not enough for this; maybe a Set is needed.
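
A minimal sketch of that suggestion (all names here are assumptions, not the 
actual patch): track which storages have reported, per DataNode, instead of a 
single flag.
{code:java}
import java.util.HashSet;
import java.util.Set;

// Sketch: during startup safemode, remember which storages (disks) of a
// DataNode have already sent a block report, so per-disk reports are not
// discarded after the first disk's report arrives.
public class ReportedStoragesSketch {
  private final Set<String> reportedStorages = new HashSet<>();

  /** Returns true only if this storage already reported (a real duplicate). */
  public synchronized boolean isDuplicate(String storageUuid) {
    return !reportedStorages.add(storageUuid);
  }

  public static void main(String[] args) {
    ReportedStoragesSketch s = new ReportedStoragesSketch();
    System.out.println(s.isDuplicate("DS-1"));  // false: first report
    System.out.println(s.isDuplicate("DS-2"));  // false: another disk
    System.out.println(s.isDuplicate("DS-1"));  // true: duplicate request
  }
}
{code}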

> Avoid to process duplicate blockreport of datanode when namenode in startup 
> safemode
> 
>
> Key: HDFS-14554
> URL: https://issues.apache.org/jira/browse/HDFS-14554
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-14554.001.patch
>
>
> When the NameNode restarts and enters the startup safemode stage, the load on 
> the NameNode can be very high because of the storm of blockreport requests 
> arriving from the datanodes together. Sometimes a datanode may send its 
> blockreport multiple times because the NameNode doesn't respond in time, so 
> the DataNode hits a timeout exception and retries, which further sharpens the 
> load on the NameNode.
> So we should check for and filter duplicate blockreport requests from one 
> datanode to reduce the load on the NameNode.






[jira] [Issue Comment Deleted] (HDFS-12733) Option to disable to namenode local edits

2019-06-06 Thread lindongdong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lindongdong updated HDFS-12733:
---
Comment: was deleted

(was: @[~shv] don't agree any more~)

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch
>
>
> As of now, edits are written to both local and shared locations, which is 
> redundant, since the local edits are never used in an HA setup.
> Disabling local edits gives a small performance improvement.






[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits

2019-05-31 Thread lindongdong (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16853543#comment-16853543
 ] 

lindongdong commented on HDFS-12733:


@[~shv] don't agree any more~

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch
>
>
> As of now, edits are written to both local and shared locations, which is 
> redundant, since the local edits are never used in an HA setup.
> Disabling local edits gives a small performance improvement.






[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits

2019-05-24 Thread lindongdong (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848026#comment-16848026
 ] 

lindongdong commented on HDFS-12733:


[~hexiaoqiao] Hi, please think about the old editlog that remains on the local 
disk if we disable the local edits.

When the NN restarts, or during disaster recovery, does it have any effect?

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch
>
>
> As of now, edits are written to both local and shared locations, which is 
> redundant, since the local edits are never used in an HA setup.
> Disabling local edits gives a small performance improvement.






[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits

2019-05-24 Thread lindongdong (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847317#comment-16847317
 ] 

lindongdong commented on HDFS-12733:


Great job!

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch
>
>
> As of now, edits are written to both local and shared locations, which is 
> redundant, since the local edits are never used in an HA setup.
> Disabling local edits gives a small performance improvement.






[jira] [Commented] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon

2019-02-20 Thread lindongdong (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16772828#comment-16772828
 ] 

lindongdong commented on HDFS-14235:


Why the changes? The first patch was perfect :D

> Handle ArrayIndexOutOfBoundsException in 
> DataNodeDiskMetrics#slowDiskDetectionDaemon 
> -
>
> Key: HDFS-14235
> URL: https://issues.apache.org/jira/browse/HDFS-14235
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Surendra Singh Lilhore
>Assignee: Ranith Sardar
>Priority: Major
> Attachments: HDFS-14235.000.patch, HDFS-14235.001.patch, 
> HDFS-14235.002.patch, HDFS-14235.003.patch, NPE.png, exception.png
>
>
> The code below throws an exception because {{volumeIterator.next()}} is 
> called twice without checking hasNext().
> {code:java}
> while (volumeIterator.hasNext()) {
>   FsVolumeSpi volume = volumeIterator.next();
>   DataNodeVolumeMetrics metrics = volumeIterator.next().getMetrics();
>   String volumeName = volume.getBaseURI().getPath();
>   metadataOpStats.put(volumeName,
>   metrics.getMetadataOperationMean());
>   readIoStats.put(volumeName, metrics.getReadIoMean());
>   writeIoStats.put(volumeName, metrics.getWriteIoMean());
> }{code}
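
A fix consistent with that description would call {{next()}} once per 
iteration and reuse the element (sketched here; the attached patches may 
differ in detail):
{code:java}
while (volumeIterator.hasNext()) {
  // Consume the iterator exactly once per iteration and derive both the
  // volume and its metrics from the same element.
  FsVolumeSpi volume = volumeIterator.next();
  DataNodeVolumeMetrics metrics = volume.getMetrics();
  String volumeName = volume.getBaseURI().getPath();
  metadataOpStats.put(volumeName, metrics.getMetadataOperationMean());
  readIoStats.put(volumeName, metrics.getReadIoMean());
  writeIoStats.put(volumeName, metrics.getWriteIoMean());
}
{code}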






[jira] [Issue Comment Deleted] (HDFS-13053) Track time to process packet in Datanode

2018-09-28 Thread lindongdong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lindongdong updated HDFS-13053:
---
Comment: was deleted

(was: Hi, [~brahmareddy], can you review the jira?  This is very useful~~ )

> Track time to process packet in Datanode
> 
>
> Key: HDFS-13053
> URL: https://issues.apache.org/jira/browse/HDFS-13053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Íñigo Goiri
>Assignee: Pulkit Misra
>Priority: Minor
> Attachments: HDFS-13053.000.patch
>
>
> We should track the time that each datanode takes to process a packet.






[jira] [Commented] (HDFS-13053) Track time to process packet in Datanode

2018-09-28 Thread lindongdong (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631820#comment-16631820
 ] 

lindongdong commented on HDFS-13053:


Hi, [~brahmareddy], can you review this jira? It is very useful~~ 

> Track time to process packet in Datanode
> 
>
> Key: HDFS-13053
> URL: https://issues.apache.org/jira/browse/HDFS-13053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Íñigo Goiri
>Assignee: Pulkit Misra
>Priority: Minor
> Attachments: HDFS-13053.000.patch
>
>
> We should track the time that each datanode takes to process a packet.






[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2018-08-25 Thread lindongdong (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592561#comment-16592561
 ] 

lindongdong commented on HDFS-13671:


Hi, [~kihwal], how is the revert work going? 

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Priority: Major
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in NameNode, there are mainly two steps:
> * Collect INodes and all blocks to be deleted, then delete INodes.
> * Remove blocks chunk by chunk in a loop.
> Actually the first step should be the more expensive operation and should 
> take more time. However, we now always see the NN hang during the 
> remove-block operation. 
> Looking into this: we introduced a new structure, {{FoldedTreeSet}}, to get 
> better performance in dealing with FBRs/IBRs. But compared with the earlier 
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower, 
> since it takes additional time to rebalance the tree nodes. When there are 
> many blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide 
> {{getBlockIterator}} to return a block iterator, and no other get operation 
> for a specified block. Do we still need to use {{FoldedTreeSet}} in 
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not 
> Update. Maybe we can revert this to the earlier implementation.
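
A rough, generic micro-illustration (plain JDK collections, not Hadoop's 
classes) of the removal-cost asymmetry described above:
{code:java}
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Rough illustration of the cost difference: bulk removal from a
// sorted/balanced structure pays for rebalancing on every delete, while a
// hash-based set does not. Numbers vary by machine; the shape is the point.
public class RemoveCostDemo {
  public static void main(String[] args) {
    final long n = 2_000_000;
    Set<Long> tree = new TreeSet<>();
    Set<Long> hash = new HashSet<>();
    for (long i = 0; i < n; i++) { tree.add(i); hash.add(i); }

    long t0 = System.nanoTime();
    for (long i = 0; i < n; i++) { tree.remove(i); }
    long t1 = System.nanoTime();
    for (long i = 0; i < n; i++) { hash.remove(i); }
    long t2 = System.nanoTime();

    System.out.printf("TreeSet removes: %d ms, HashSet removes: %d ms%n",
        (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
  }
}
{code}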






[jira] [Commented] (HDFS-11112) Journal Nodes should refuse to format non-empty directories

2018-07-19 Thread lindongdong (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550192#comment-16550192
 ] 

lindongdong commented on HDFS-11112:


[~vinayrpet], I totally agree with you. 
{panel:title=the opinion}
I agree that this will avoid accidental deletes.
But still it should be able to format on demand. i.e. When the answer to prompt 
is 'yes' during reformat. Similar to Namenode dir format.
Namenode calls the JournalNode format() only after confirmation in the prompt 
(or "-force") was mentioned.
Ideally, we should be passing on the confirmation of prompt via RPC to 
JournalNode as well.
{panel}


> Journal Nodes should refuse to format non-empty directories
> ---
>
> Key: HDFS-11112
> URL: https://issues.apache.org/jira/browse/HDFS-11112
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Assignee: Yiqun Lin
>Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11112.001.patch, HDFS-11112.002.patch
>
>
> Journal Nodes should reject the {{format}} RPC request if a storage directory 
> is non-empty. The relevant code is in {{JNStorage#format}}.
> {code}
>   void format(NamespaceInfo nsInfo) throws IOException {
> setStorageInfo(nsInfo);
> ...
> unlockAll();
> sd.clearDirectory();
> writeProperties(sd);
> createPaxosDir();
> analyzeStorage();
> {code}
> This would make the behavior similar to {{namenode -format -nonInteractive}}.
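
A hedged sketch of the guard being requested (the method shape follows the 
snippet above; the force parameter and exact behavior are assumptions, not the 
merged change):
{code:java}
// Sketch: refuse to format a non-empty storage directory unless the caller
// explicitly confirmed, with the confirmation ideally carried through the
// RPC as suggested in the comments.
void format(NamespaceInfo nsInfo, boolean force) throws IOException {
  File current = sd.getCurrentDir();
  String[] contents = current.list();
  if (!force && contents != null && contents.length > 0) {
    throw new IOException("Refusing to format non-empty directory "
        + current);
  }
  setStorageInfo(nsInfo);
  unlockAll();
  sd.clearDirectory();
  writeProperties(sd);
  createPaxosDir();
  analyzeStorage();
}
{code}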






[jira] [Comment Edited] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN

2018-02-11 Thread lindongdong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359766#comment-16359766
 ] 

lindongdong edited comment on HDFS-8693 at 2/12/18 3:59 AM:


I met some errors with this patch.

Say the cluster has 3 nodes: A, B, C, and the NNs are on A and B.

When we remove B and install a new SNN on C, all DNs fail to register with the 
new SNN. The errors look like the below:
{code:java}
2018-02-09 19:49:02,728 | WARN | DataNode: 
[[[DISK]file:/_1/b-b_2/bb_3/b_4/b-5/B-2/B-3/B-4/bbb-b/hadoop/data1/dn/]]
 heartbeating to 189-219-255-103/x.x.x.x:25006 | Problem connecting to server: 
189-219-255-103/x.x.x.x:25006 | BPServiceActor.java:197
2018-02-09 19:49:07,731 | WARN | DataNode: 
[[[DISK]file:/_1/b-b_2/bb_3/b_4/b-5/B-2/B-3/B-4/bbb-b/hadoop/data1/dn/]]
 heartbeating to 189-219-255-103/x.x.x.x:25006 | Exception encountered while 
connecting to the server : javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)] | Client.java:726
{code}


was (Author: lindongdong):
I meet some errors about this patch.

If the cluster has 3 nodes: A, B, C, and the NNs is in A, B.

When we remove B, and install a new SNN in C, all DNs fail to register to the 
new SNN. Error like the below:
{code:java}
2018-02-09 19:49:02,728 | WARN | DataNode: 
[[[DISK]file:/_1/b-b_2/bb_3/b_4/b-5/B-2/B-3/B-4/bbb-b/hadoop/data1/dn/]]
 heartbeating to 189-219-255-103/10.219.255.103:25006 | Problem connecting to 
server: 189-219-255-103/10.219.255.103:25006 | BPServiceActor.java:197
2018-02-09 19:49:07,731 | WARN | DataNode: 
[[[DISK]file:/_1/b-b_2/bb_3/b_4/b-5/B-2/B-3/B-4/bbb-b/hadoop/data1/dn/]]
 heartbeating to 189-219-255-103/10.219.255.103:25006 | Exception encountered 
while connecting to the server : javax.security.sasl.SaslException: GSS 
initiate failed [Caused by GSSException: No valid credentials provided 
(Mechanism level: Failed to find any Kerberos tgt)] | Client.java:726
{code}

> refreshNamenodes does not support adding a new standby to a running DN
> --
>
> Key: HDFS-8693
> URL: https://issues.apache.org/jira/browse/HDFS-8693
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ha
>Affects Versions: 2.6.0
>Reporter: Jian Fang
>Assignee: Ajith S
>Priority: Critical
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1, 2.8.4
>
> Attachments: HDFS-8693.02.patch, HDFS-8693.03.patch, HDFS-8693.1.patch
>
>
> I tried to run the following command on a Hadoop 2.6.0 cluster with HA 
> support 
> $ hdfs dfsadmin -refreshNamenodes datanode-host:port
> to refresh name nodes on data nodes after I replaced one name node with a new 
> one so that I don't need to restart the data nodes. However, I got the 
> following error:
> refreshNamenodes: HA does not currently support adding a new standby to a 
> running DN. Please do a rolling restart of DNs to reconfigure the list of NNs.
> I checked the 2.6.0 code and the error was thrown by the following code 
> snippet, which led me to this JIRA.
> void refreshNNList(ArrayList addrs) throws IOException {
> Set oldAddrs = Sets.newHashSet();
> for (BPServiceActor actor : bpServices)
> { oldAddrs.add(actor.getNNSocketAddress()); }
> Set newAddrs = Sets.newHashSet(addrs);
> if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty())
> { // Keep things simple for now -- we can implement this at a later date. 
> throw new IOException( "HA does not currently support adding a new standby to 
> a running DN. " + "Please do a rolling restart of DNs to reconfigure the list 
> of NNs."); }
> }
> Looks like the refreshNamenodes command is an incomplete feature. 
> Unfortunately, the new name node on a replacement is critical for auto 
> provisioning a hadoop cluster with HDFS HA support. Without this support, the 
> HA feature could not really be used. I also observed that the new standby 
> name node on the replacement instance could stuck in safe mode because no 
> data nodes check in with it. Even with a rolling restart, it may take quite 
> some time to restart all data nodes if we have a big cluster, for example, 
> with 4000 data nodes, let alone restarting DN is way too intrusive and it is 
> not a preferable operation in production. It also increases the chance for a 
> double failure because the standby name node is not really ready for a 
> failover in the 

[jira] [Comment Edited] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN

2018-02-11 Thread lindongdong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359766#comment-16359766
 ] 

lindongdong edited comment on HDFS-8693 at 2/12/18 3:58 AM:


I met some errors with this patch.

Say the cluster has 3 nodes: A, B, C, and the NNs are on A and B.

When we remove B and install a new SNN on C, all DNs fail to register with the 
new SNN. The errors look like the below:
{code:java}
2018-02-09 19:49:02,728 | WARN | DataNode: 
[[[DISK]file:/_1/b-b_2/bb_3/b_4/b-5/B-2/B-3/B-4/bbb-b/hadoop/data1/dn/]]
 heartbeating to 189-219-255-103/10.219.255.103:25006 | Problem connecting to 
server: 189-219-255-103/10.219.255.103:25006 | BPServiceActor.java:197
2018-02-09 19:49:07,731 | WARN | DataNode: 
[[[DISK]file:/_1/b-b_2/bb_3/b_4/b-5/B-2/B-3/B-4/bbb-b/hadoop/data1/dn/]]
 heartbeating to 189-219-255-103/10.219.255.103:25006 | Exception encountered 
while connecting to the server : javax.security.sasl.SaslException: GSS 
initiate failed [Caused by GSSException: No valid credentials provided 
(Mechanism level: Failed to find any Kerberos tgt)] | Client.java:726
{code}


was (Author: lindongdong):
I meet some errors about this patch.

If the cluster has 3 nodes: A, B, C, and the NNs is in A, B.

When we remove B, and install a new SNN in C, all DNs fail to register to the 
new SNN. Error like the below:
{code:java}
2018-02-09 19:49:02,728 | WARN | DataNode: 
[[[DISK]file:/_1/b-b_2/bb_3/b_4/b-5/B-2/B-3/B-4/bbb-b/hadoop/data1/dn/]]
 heartbeating to 189-219-255-103/189.219.255.103:25006 | Problem connecting to 
server: 189-219-255-103/189.219.255.103:25006 | BPServiceActor.java:197
2018-02-09 19:49:07,731 | WARN | DataNode: 
[[[DISK]file:/_1/b-b_2/bb_3/b_4/b-5/B-2/B-3/B-4/bbb-b/hadoop/data1/dn/]]
 heartbeating to 189-219-255-103/189.219.255.103:25006 | Exception encountered 
while connecting to the server : javax.security.sasl.SaslException: GSS 
initiate failed [Caused by GSSException: No valid credentials provided 
(Mechanism level: Failed to find any Kerberos tgt)] | Client.java:726
{code}

> refreshNamenodes does not support adding a new standby to a running DN
> --
>
> Key: HDFS-8693
> URL: https://issues.apache.org/jira/browse/HDFS-8693
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ha
>Affects Versions: 2.6.0
>Reporter: Jian Fang
>Assignee: Ajith S
>Priority: Critical
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1, 2.8.4
>
> Attachments: HDFS-8693.02.patch, HDFS-8693.03.patch, HDFS-8693.1.patch
>
>
> I tried to run the following command on a Hadoop 2.6.0 cluster with HA 
> support 
> $ hdfs dfsadmin -refreshNamenodes datanode-host:port
> to refresh name nodes on data nodes after I replaced one name node with a new 
> one so that I don't need to restart the data nodes. However, I got the 
> following error:
> refreshNamenodes: HA does not currently support adding a new standby to a 
> running DN. Please do a rolling restart of DNs to reconfigure the list of NNs.
> I checked the 2.6.0 code and the error was thrown by the following code 
> snippet, which led me to this JIRA.
> void refreshNNList(ArrayList addrs) throws IOException {
> Set oldAddrs = Sets.newHashSet();
> for (BPServiceActor actor : bpServices)
> { oldAddrs.add(actor.getNNSocketAddress()); }
> Set newAddrs = Sets.newHashSet(addrs);
> if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty())
> { // Keep things simple for now -- we can implement this at a later date. 
> throw new IOException( "HA does not currently support adding a new standby to 
> a running DN. " + "Please do a rolling restart of DNs to reconfigure the list 
> of NNs."); }
> }
> Looks like the refreshNamenodes command is an incomplete feature. 
> Unfortunately, the new name node on a replacement is critical for auto 
> provisioning a hadoop cluster with HDFS HA support. Without this support, the 
> HA feature could not really be used. I also observed that the new standby 
> name node on the replacement instance could get stuck in safe mode because no 
> data nodes check in with it. Even with a rolling restart, it may take quite 
> some time to restart all data nodes if we have a big cluster, for example, 
> with 4000 data nodes, let alone restarting DN is way too intrusive and it is 
> not a preferable operation in production. It also increases the chance for a 
> double failure because the standby name node is not really ready 

[jira] [Commented] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN

2018-02-10 Thread lindongdong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359766#comment-16359766
 ] 

lindongdong commented on HDFS-8693:
---

I met some errors with this patch.

Say the cluster has 3 nodes: A, B, C, and the NNs are on A and B.

When we remove B and install a new SNN on C, all DNs fail to register with the 
new SNN. The errors look like the below:
{code:java}
2018-02-09 19:49:02,728 | WARN | DataNode: 
[[[DISK]file:/_1/b-b_2/bb_3/b_4/b-5/B-2/B-3/B-4/bbb-b/hadoop/data1/dn/]]
 heartbeating to 189-219-255-103/189.219.255.103:25006 | Problem connecting to 
server: 189-219-255-103/189.219.255.103:25006 | BPServiceActor.java:197
2018-02-09 19:49:07,731 | WARN | DataNode: 
[[[DISK]file:/_1/b-b_2/bb_3/b_4/b-5/B-2/B-3/B-4/bbb-b/hadoop/data1/dn/]]
 heartbeating to 189-219-255-103/189.219.255.103:25006 | Exception encountered 
while connecting to the server : javax.security.sasl.SaslException: GSS 
initiate failed [Caused by GSSException: No valid credentials provided 
(Mechanism level: Failed to find any Kerberos tgt)] | Client.java:726
{code}

> refreshNamenodes does not support adding a new standby to a running DN
> --
>
> Key: HDFS-8693
> URL: https://issues.apache.org/jira/browse/HDFS-8693
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ha
>Affects Versions: 2.6.0
>Reporter: Jian Fang
>Assignee: Ajith S
>Priority: Critical
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1, 2.8.4
>
> Attachments: HDFS-8693.02.patch, HDFS-8693.03.patch, HDFS-8693.1.patch
>
>
> I tried to run the following command on a Hadoop 2.6.0 cluster with HA 
> support 
> $ hdfs dfsadmin -refreshNamenodes datanode-host:port
> to refresh name nodes on data nodes after I replaced one name node with a new 
> one so that I don't need to restart the data nodes. However, I got the 
> following error:
> refreshNamenodes: HA does not currently support adding a new standby to a 
> running DN. Please do a rolling restart of DNs to reconfigure the list of NNs.
> I checked the 2.6.0 code and the error was thrown by the following code 
> snippet, which led me to this JIRA.
> void refreshNNList(ArrayList addrs) throws IOException {
> Set oldAddrs = Sets.newHashSet();
> for (BPServiceActor actor : bpServices)
> { oldAddrs.add(actor.getNNSocketAddress()); }
> Set newAddrs = Sets.newHashSet(addrs);
> if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty())
> { // Keep things simple for now -- we can implement this at a later date. 
> throw new IOException( "HA does not currently support adding a new standby to 
> a running DN. " + "Please do a rolling restart of DNs to reconfigure the list 
> of NNs."); }
> }
> Looks like the refreshNamenodes command is an incomplete feature. 
> Unfortunately, the new name node on a replacement is critical for auto 
> provisioning a hadoop cluster with HDFS HA support. Without this support, the 
> HA feature could not really be used. I also observed that the new standby 
> name node on the replacement instance could get stuck in safe mode because no 
> data nodes check in with it. Even with a rolling restart, it may take quite 
> some time to restart all data nodes if we have a big cluster, for example, 
> with 4000 data nodes, let alone restarting DN is way too intrusive and it is 
> not a preferable operation in production. It also increases the chance for a 
> double failure because the standby name node is not really ready for a 
> failover in the case that the current active name node fails. 






[jira] [Commented] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN

2017-12-10 Thread lindongdong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16285590#comment-16285590
 ] 

lindongdong commented on HDFS-8693:
---

Good job! Waiting for good news~~ :)

> refreshNamenodes does not support adding a new standby to a running DN
> --
>
> Key: HDFS-8693
> URL: https://issues.apache.org/jira/browse/HDFS-8693
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ha
>Affects Versions: 2.6.0
>Reporter: Jian Fang
>Assignee: Ajith S
>Priority: Critical
> Attachments: HDFS-8693.02.patch, HDFS-8693.03.patch, HDFS-8693.1.patch
>
>
> I tried to run the following command on a Hadoop 2.6.0 cluster with HA 
> support 
> $ hdfs dfsadmin -refreshNamenodes datanode-host:port
> to refresh name nodes on data nodes after I replaced one name node with a new 
> one so that I don't need to restart the data nodes. However, I got the 
> following error:
> refreshNamenodes: HA does not currently support adding a new standby to a 
> running DN. Please do a rolling restart of DNs to reconfigure the list of NNs.
> I checked the 2.6.0 code and the error was thrown by the following code 
> snippet, which led me to this JIRA.
> void refreshNNList(ArrayList addrs) throws IOException {
> Set oldAddrs = Sets.newHashSet();
> for (BPServiceActor actor : bpServices)
> { oldAddrs.add(actor.getNNSocketAddress()); }
> Set newAddrs = Sets.newHashSet(addrs);
> if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty())
> { // Keep things simple for now -- we can implement this at a later date. 
> throw new IOException( "HA does not currently support adding a new standby to 
> a running DN. " + "Please do a rolling restart of DNs to reconfigure the list 
> of NNs."); }
> }
> Looks like the refreshNamenodes command is an incomplete feature. 
> Unfortunately, the new name node on a replacement is critical for auto 
> provisioning a hadoop cluster with HDFS HA support. Without this support, the 
> HA feature could not really be used. I also observed that the new standby 
> name node on the replacement instance could get stuck in safe mode because no 
> data nodes check in with it. Even with a rolling restart, it may take quite 
> some time to restart all data nodes if we have a big cluster, for example, 
> with 4000 data nodes, let alone restarting DN is way too intrusive and it is 
> not a preferable operation in production. It also increases the chance for a 
> double failure because the standby name node is not really ready for a 
> failover in the case that the current active name node fails. 






[jira] [Commented] (HDFS-11576) Block recovery will fail indefinitely if recovery time > heartbeat interval

2017-09-01 Thread lindongdong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150198#comment-16150198
 ] 

lindongdong commented on HDFS-11576:


I met this problem too, and am waiting for the patch. Thanks, Lukas :)

> Block recovery will fail indefinitely if recovery time > heartbeat interval
> ---
>
> Key: HDFS-11576
> URL: https://issues.apache.org/jira/browse/HDFS-11576
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs, namenode
>Affects Versions: 2.7.1, 2.7.2, 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Attachments: HDFS-11576.001.patch, HDFS-11576.002.patch, 
> HDFS-11576.003.patch, HDFS-11576.004.patch, HDFS-11576.005.patch, 
> HDFS-11576.006.patch, HDFS-11576.007.patch, HDFS-11576.008.patch, 
> HDFS-11576.009.patch, HDFS-11576.010.patch, HDFS-11576.repro.patch
>
>
> Block recovery will fail indefinitely if the time to recover a block is 
> always longer than the heartbeat interval. Scenario:
> 1. DN sends heartbeat 
> 2. NN sends a recovery command to DN, recoveryID=X
> 3. DN starts recovery
> 4. DN sends another heartbeat
> 5. NN sends a recovery command to DN, recoveryID=X+1
> 6. DN calls commitBlockSynchronization on the NN after succeeding with the 
> first recovery, which fails because X < X+1
> ... 
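
The rejection in step 6 boils down to a freshness check on the recovery ID at 
the NN; roughly (a sketch built from the scenario above, not the exact 
NameNode source):
{code:java}
// Sketch of the check behind step 6: the NN only accepts a
// commitBlockSynchronization whose recovery id matches the newest one it
// issued, so a recovery that outlives one heartbeat interval always loses.
public class RecoveryIdCheckSketch {
  public static void main(String[] args) throws Exception {
    long recoveryId = 10;         // id X, from the first recovery command
    long currentRecoveryId = 11;  // id X+1, issued on the next heartbeat
    if (recoveryId != currentRecoveryId) {
      throw new java.io.IOException("recovery id " + recoveryId
          + " does not match current recovery id " + currentRecoveryId);
    }
  }
}
{code}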


