[jira] [Commented] (HADOOP-15996) Plugin interface to support more complex usernames in Hadoop

2018-12-22 Thread Bolke de Bruin (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16727824#comment-16727824
 ] 

Bolke de Bruin commented on HADOOP-15996:
-

[~eyang] thanks for the stack trace. I'm trying to set up my own full testing 
environment, but currently being on a low-bandwidth connection makes this a bit 
of a challenge.

Still, this remains suspicious: KerberosName (and HadoopKerberosName) start out 
with "null" rules. Obviously, in your environment these get set somewhere. 
This happens either through UserGroupInformation.setConfiguration, 
HadoopKerberosName.setConfiguration, or (Hadoop)KerberosName.setRules. There is 
no other way, as there is no explicit mapping from 
"hadoop.security.auth_to_local" to "kerberos.name.rules" otherwise.

I'll look into your suggestion in the meantime.

 

> Plugin interface to support more complex usernames in Hadoop
> 
>
> Key: HADOOP-15996
> URL: https://issues.apache.org/jira/browse/HADOOP-15996
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: security
>Reporter: Eric Yang
>Assignee: Bolke de Bruin
>Priority: Major
> Attachments: 0001-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> 0001-Simple-trial-of-using-krb5.conf-for-auth_to_local-ru.patch, 
> 0002-HADOOP-15996-Make-auth-to-local-configurable.patch, 
> 0003-HADOOP-15996-Make-auth-to-local-configurable.patch
>
>
> Following the recent security mailing list vote to revert HADOOP-12751, 
> Hadoop does not support the @ character in usernames.  A Hadoop auth_to_local 
> rule must match in order to authorize a user to log in to a Hadoop cluster.  
> This design does not work well in a multi-realm environment where identical 
> usernames in two realms do not map to the same user.  There is also the 
> possibility that a lossy regex can incorrectly map users.  In the interest of 
> supporting multiple realms, it may be preferable to pass the principal name 
> through without rewriting, to uniquely distinguish users.  This jira is to 
> revisit whether Hadoop can support full principal names without rewrite and 
> to provide a plugin to override Hadoop's default implementation of 
> auth_to_local for the multi-realm use case.
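
Purely as an illustration of the proposal, such a plugin hook could look 
something like this (all names here are hypothetical, not taken from any 
attached patch):

{code:java}
import java.io.IOException;
import org.apache.hadoop.security.HadoopKerberosName;

// Hypothetical plugin interface: maps a full Kerberos principal
// (user@REALM) to the name Hadoop should use, bypassing auth_to_local.
public interface PrincipalNameResolver {
  String resolve(String fullPrincipal) throws IOException;
}

// Hypothetical default that preserves today's rule-based behavior.
public class RuleBasedResolver implements PrincipalNameResolver {
  @Override
  public String resolve(String fullPrincipal) throws IOException {
    return new HadoopKerberosName(fullPrincipal).getShortName();
  }
}
{code}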



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-16018) DistCp won't reassemble chunks when blocks per chunk > 0

2018-12-22 Thread Kai X (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16727798#comment-16727798
 ] 

Kai X edited comment on HADOOP-16018 at 12/23/18 4:59 AM:
--

I can observe that `BLOCKS_PER_CHUNK.getConfigLabel()` is used for the first 
time in HADOOP-15850, and that is where it hits this regression.

 

Also verified the following in my cluster:
 * hadoop-distcp-2.9.1 can reassemble chunks (can be used as a workaround)
 * 2.9.2 cannot; the debug log in the CopyCommitter ctor always prints "blocks 
per chunk 0"
 * 2.9.2 with the patch applied can reassemble chunks; the debug log prints the 
correct value for blocks per chunk.

 

 


was (Author: kai33):
I can observe that `BLOCKS_PER_CHUNK.getConfigLabel()` is used for the first 
time in HADOOP-15850, and that is where it hits this regression.

 

Also verified the following in my cluster:
 * hadoop-distcp-2.9.1 can reassemble chunks (can be used as a workaround)
 * 2.9.2 cannot
 * 2.9.2 with the patch applied can reassemble chunks

 

 

> DistCp won't reassemble chunks when blocks per chunk > 0
> 
>
> Key: HADOOP-16018
> URL: https://issues.apache.org/jira/browse/HADOOP-16018
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.2.0, 2.9.2
>Reporter: Kai X
>Priority: Major
>
> I was investigating why hadoop-distcp-2.9.2 won't reassemble chunks of the 
> same file when blocks per chunk has been set > 0.
> In CopyCommitter::commitJob, this logic skips the reassembly of chunks 
> whenever blocksPerChunk is equal to 0:
> {code:java}
> if (blocksPerChunk > 0) {
>   concatFileChunks(conf);
> }
> {code}
> Then in CopyCommitter's ctor, blocksPerChunk is initialised from the config:
> {code:java}
> blocksPerChunk = context.getConfiguration().getInt(
> DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
> {code}
>  
> But here DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel() will always 
> return an empty string, because the switch is constructed without a config 
> label:
> {code:java}
> BLOCKS_PER_CHUNK("",
> new Option("blocksperchunk", true, "If set to a positive value, files"
> + "with more blocks than this value will be split into chunks of "
> + "<blocksperchunk> blocks to be transferred in parallel, and "
> + "reassembled on the destination. By default, <blocksperchunk> is "
> + "0 and the files will be transmitted in their entirety without "
> + "splitting. This switch is only applicable when the source file "
> + "system implements getBlockLocations method and the target file "
> + "system implements concat method"))
> {code}
> As a result, blocksPerChunk falls back to the default value 0, which 
> prevents the chunks from being reassembled.
>  
>  
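
A minimal sketch of the shape of the fix (the config label value below is an 
assumption for illustration; the actual patch may use a different key):

{code:java}
// Construct the switch with a real config label so getConfigLabel()
// returns a usable key instead of "":
BLOCKS_PER_CHUNK("distcp.blocks.per.chunk",
    new Option("blocksperchunk", true, "If set to a positive value, ..."))

// With that in place, the CopyCommitter ctor lookup can resolve the value
// that was stored under the label when the option was parsed:
blocksPerChunk = context.getConfiguration().getInt(
    DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
{code}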



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16018) DistCp won't reassemble chunks when blocks per chunk > 0

2018-12-22 Thread Kai X (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16727798#comment-16727798
 ] 

Kai X commented on HADOOP-16018:


I can observe that `BLOCKS_PER_CHUNK.getConfigLabel()` is used for the first 
time in HADOOP-15850, and that is where it hits this regression.

 

Also verified the following in my cluster:
 * hadoop-distcp-2.9.1 can reassemble chunks (can be used as a workaround)
 * 2.9.2 cannot
 * 2.9.2 with the patch applied can reassemble chunks

 

 

> DistCp won't reassemble chunks when blocks per chunk > 0
> 
>
> Key: HADOOP-16018
> URL: https://issues.apache.org/jira/browse/HADOOP-16018
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.2.0, 2.9.2
>Reporter: Kai X
>Priority: Major
>
> I was investigating why hadoop-distcp-2.9.2 won't reassemble chunks of the 
> same file when blocks per chunk has been set > 0.
> In CopyCommitter::commitJob, this logic skips the reassembly of chunks 
> whenever blocksPerChunk is equal to 0:
> {code:java}
> if (blocksPerChunk > 0) {
>   concatFileChunks(conf);
> }
> {code}
> Then in CopyCommitter's ctor, blocksPerChunk is initialised from the config:
> {code:java}
> blocksPerChunk = context.getConfiguration().getInt(
> DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
> {code}
>  
> But here DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel() will always 
> return an empty string, because the switch is constructed without a config 
> label:
> {code:java}
> BLOCKS_PER_CHUNK("",
> new Option("blocksperchunk", true, "If set to a positive value, files"
> + "with more blocks than this value will be split into chunks of "
> + "<blocksperchunk> blocks to be transferred in parallel, and "
> + "reassembled on the destination. By default, <blocksperchunk> is "
> + "0 and the files will be transmitted in their entirety without "
> + "splitting. This switch is only applicable when the source file "
> + "system implements getBlockLocations method and the target file "
> + "system implements concat method"))
> {code}
> As a result, blocksPerChunk falls back to the default value 0, which 
> prevents the chunks from being reassembled.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15996) Plugin interface to support more complex usernames in Hadoop

2018-12-22 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16727543#comment-16727543
 ] 

Eric Yang commented on HADOOP-15996:


[~bolke] The latest patch works for the YARN REST API, but it fails on HDFS API 
and HDFS IPC calls.  It looks like when a UserGroupInformation instance is 
constructed via doAs or reflection, the rule mechanism is not set.  Here is a 
stack trace that shows the calling stack using the 0003 patch:

Running: hdfs dfs -ls /
{code}
2018-12-22 19:16:39,598 WARN ipc.Client: Couldn't setup connection for 
l...@example.com to eyang-1.example.com/172.01.111.17:9000
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at 
org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1817)
at 
org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:361)
at 
org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:617)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:411)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:804)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:800)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:800)
at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.Client.call(Client.java:1367)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:903)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1665)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1582)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594)
at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:65)
at org.apache.hadoop.fs.Globber.doGlob(Globber.java:294)
at org.apache.hadoop.fs.Globber.glob(Globber.java:149)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:2015)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:353)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:250)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:233)
at 
org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:104)
at org.apache.hadoop.fs.shell.Command.run(Command.java:177)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:327)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:390)
ls: DestHost:destPort eyang-1.openstacklocal:9000 , LocalHost:localPort 
eyang-1.openstacklocal/172.26.111.17:0. Failed on local exception: 
java.io.IOException: Couldn't setup connection for l...@example.com to 
eyang-1.openstacklocal/172.01.111.17:9000
{code}

Running: curl --negotiate -u : http://eyang-1.example.com:50070/webhdfs/v1/
{code}

[jira] [Commented] (HADOOP-16005) NativeAzureFileSystem does not support setXAttr

2018-12-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16727540#comment-16727540
 ] 

ASF GitHub Bot commented on HADOOP-16005:
-

GitHub user c-w opened a pull request:

https://github.com/apache/hadoop/pull/452

HADOOP-16005: Add XAttr support to WASB and ABFS

As discussed in 
[HADOOP-16005](https://issues.apache.org/jira/browse/HADOOP-16005), this pull 
request implements `getXAttr` and `setXAttr` on hadoop-azure's WASB and ABFS 
file-systems.

The changes were tested against the following Azure storage account 
configurations:

- WASB: StorageV2, RA-GRS replication in East US (primary) and West US 
(secondary). [WASB test session 
screenshot](https://user-images.githubusercontent.com/1086421/50362109-699f5a00-0534-11e9-97c9-e8a7cee6e6c6.png).
 All tests pass and the ABFS tests are skipped as expected.

- ABFS: StorageV2 with Data Lake Storage Gen2 preview enabled, RA-GRS 
replication in East US (primary) and West US (secondary). [ABFS test session 
screenshot](https://user-images.githubusercontent.com/1086421/50361278-fea05400-0530-11e9-9cb4-cc23dec87cfc.png).
 All ABFS tests pass but the WASB tests fail since the storage account hasn't 
implemented the blob endpoints yet.

The test-patch script passed: [test-patch 
output](https://user-images.githubusercontent.com/1086421/50377952-50aaad80-05f5-11e9-8ea2-b7bf99fc7509.png).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CatalystCode/hadoop hadoop-16005

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #452


commit 1c8303a5af1016455d23ce78508f911a10af4e77
Author: Clemens Wolff 
Date:   2018-12-20T21:30:56Z

Add setXAttr and getXAttr to WASB and ABFS




> NativeAzureFileSystem does not support setXAttr
> ---
>
> Key: HADOOP-16005
> URL: https://issues.apache.org/jira/browse/HADOOP-16005
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Clemens Wolff
>Priority: Major
>
> When interacting with Azure Blob Storage via the Hadoop FileSystem client, 
> it's currently (as of 
> [a8bbd81|https://github.com/apache/hadoop/commit/a8bbd818d5bc4762324bcdb7cf1fdd5c2f93891b])
>  not possible to set custom metadata attributes.
> Here is a snippet that demonstrates the missing behavior (throws an 
> UnsupportedOperationException):
> {code:java}
> val blobAccount = "SET ME"
> val blobKey = "SET ME"
> val blobContainer = "SET ME"
> val blobFile = "SET ME"
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
> val conf = new Configuration()
> conf.set("fs.wasbs.impl", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
> conf.set(s"fs.azure.account.key.$blobAccount.blob.core.windows.net", blobKey)
> val path = new 
> Path(s"wasbs://$blobContainer@$blobAccount.blob.core.windows.net/$blobFile")
> val fs = FileSystem.get(path, conf)
> fs.setXAttr(path, "somekey", "somevalue".getBytes)
> {code}
> Looking at the code in hadoop-tools/hadoop-azure, NativeAzureFileSystem 
> inherits the default setXAttr from FileSystem, which throws the 
> UnsupportedOperationException.
> The underlying Azure Blob Storage service does support custom metadata 
> ([service 
> docs|https://docs.microsoft.com/en-us/azure/storage/blobs/storage-properties-metadata])
>  as does the azure-storage SDK that's being used by NativeAzureFileSystem 
> ([SDK 
> docs|http://javadox.com/com.microsoft.azure/azure-storage/2.0.0/com/microsoft/azure/storage/blob/CloudBlob.html#setMetadata(java.util.HashMap)]).
> Is there another way that I should be setting custom metadata on Azure Blob 
> Storage files? Is there a specific reason why setXAttr hasn't been 
> implemented on NativeAzureFileSystem? If not, I can take a shot at 
> implementing it.
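
In case it helps review, here is a rough sketch of how setXAttr can be mapped 
onto blob metadata with the azure-storage SDK (illustrative only, not the code 
in PR #452; container is assumed to be the filesystem's CloudBlobContainer, 
and toKey() is a hypothetical path-to-blob-name helper):

{code:java}
import java.io.IOException;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import com.microsoft.azure.storage.StorageException;
import com.microsoft.azure.storage.blob.CloudBlockBlob;

public void setXAttr(Path path, String name, byte[] value) throws IOException {
  try {
    CloudBlockBlob blob = container.getBlockBlobReference(toKey(path));
    blob.downloadAttributes();            // load the current metadata map
    HashMap<String, String> metadata = blob.getMetadata();
    metadata.put(name, new String(value, StandardCharsets.UTF_8));
    blob.setMetadata(metadata);
    blob.uploadMetadata();                // persist the updated metadata
  } catch (URISyntaxException | StorageException e) {
    throw new IOException("setXAttr failed for " + path, e);
  }
}
{code}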



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] hadoop pull request #452: HADOOP-16005: Add XAttr support to WASB and ABFS

2018-12-22 Thread c-w
GitHub user c-w opened a pull request:

https://github.com/apache/hadoop/pull/452

HADOOP-16005: Add XAttr support to WASB and ABFS

As discussed in 
[HADOOP-16005](https://issues.apache.org/jira/browse/HADOOP-16005), this pull 
request implements `getXAttr` and `setXAttr` on hadoop-azure's WASB and ABFS 
file-systems.

The changes were tested against the following Azure storage account 
configurations:

- WASB: StorageV2, RA-GRS replication in East US (primary) and West US 
(secondary). [WASB test session 
screenshot](https://user-images.githubusercontent.com/1086421/50362109-699f5a00-0534-11e9-97c9-e8a7cee6e6c6.png).
 All tests pass and the ABFS tests are skipped as expected.

- ABFS: StorageV2 with Data Lake Storage Gen2 preview enabled, RA-GRS 
replication in East US (primary) and West US (secondary). [ABFS test session 
screenshot](https://user-images.githubusercontent.com/1086421/50361278-fea05400-0530-11e9-9cb4-cc23dec87cfc.png).
 All ABFS tests pass but the WASB tests fail since the storage account hasn't 
implemented the blob endpoints yet.

The test-patch script passed: [test-patch 
output](https://user-images.githubusercontent.com/1086421/50377952-50aaad80-05f5-11e9-8ea2-b7bf99fc7509.png).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CatalystCode/hadoop hadoop-16005

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #452


commit 1c8303a5af1016455d23ce78508f911a10af4e77
Author: Clemens Wolff 
Date:   2018-12-20T21:30:56Z

Add setXAttr and getXAttr to WASB and ABFS




---

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16018) DistCp won't reassemble chunks when blocks per chunk > 0

2018-12-22 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16727525#comment-16727525
 ] 

Ted Yu commented on HADOOP-16018:
-

Looking at 
https://github.com/apache/hadoop/commits/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptionSwitch.java
 , DistCpOptionSwitch.java itself was not touched by HADOOP-15850.

> DistCp won't reassemble chunks when blocks per chunk > 0
> 
>
> Key: HADOOP-16018
> URL: https://issues.apache.org/jira/browse/HADOOP-16018
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.2.0, 2.9.2
>Reporter: Kai X
>Priority: Major
>
> I was investigating why hadoop-distcp-2.9.2 won't reassemble chunks of the 
> same file when blocks per chunk has been set > 0.
> In CopyCommitter::commitJob, this logic skips the reassembly of chunks 
> whenever blocksPerChunk is equal to 0:
> {code:java}
> if (blocksPerChunk > 0) {
>   concatFileChunks(conf);
> }
> {code}
> Then in CopyCommitter's ctor, blocksPerChunk is initialised from the config:
> {code:java}
> blocksPerChunk = context.getConfiguration().getInt(
> DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
> {code}
>  
> But here DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel() will always 
> return an empty string, because the switch is constructed without a config 
> label:
> {code:java}
> BLOCKS_PER_CHUNK("",
> new Option("blocksperchunk", true, "If set to a positive value, files"
> + "with more blocks than this value will be split into chunks of "
> + "<blocksperchunk> blocks to be transferred in parallel, and "
> + "reassembled on the destination. By default, <blocksperchunk> is "
> + "0 and the files will be transmitted in their entirety without "
> + "splitting. This switch is only applicable when the source file "
> + "system implements getBlockLocations method and the target file "
> + "system implements concat method"))
> {code}
> As a result, blocksPerChunk falls back to the default value 0, which 
> prevents the chunks from being reassembled.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16014) Fix test, checkstyle and javadoc issues in TestKerberosAuthenticationHandler

2018-12-22 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HADOOP-16014:

Fix Version/s: (was: 3.2.1)
   3.3.0

> Fix test, checkstyle and javadoc issues in TestKerberosAuthenticationHandler
> 
>
> Key: HADOOP-16014
> URL: https://issues.apache.org/jira/browse/HADOOP-16014
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: test
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-16014.001.patch, HADOOP-16014.002.patch
>
>
> TestKerberosAuthenticationHandler has multiple checkstyle violations and 
> missing javadoc, and some tests are not annotated with @Test and thus are not 
> being run.
> This jira aims to fix the above issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16018) DistCp won't reassemble chunks when blocks per chunk > 0

2018-12-22 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16727515#comment-16727515
 ] 

Wei-Chiu Chuang commented on HADOOP-16018:
--

Hi Kai X, thanks for reporting the issue.
Was this broken by HADOOP-15850? [~yuzhih...@gmail.com] FYI

> DistCp won't reassemble chunks when blocks per chunk > 0
> 
>
> Key: HADOOP-16018
> URL: https://issues.apache.org/jira/browse/HADOOP-16018
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: tools/distcp
>Affects Versions: 3.2.0, 2.9.2
>Reporter: Kai X
>Priority: Major
>
> I was investigating why hadoop-distcp-2.9.2 won't reassemble chunks of the 
> same file when blocks per chunk has been set > 0.
> In CopyCommitter::commitJob, this logic skips the reassembly of chunks 
> whenever blocksPerChunk is equal to 0:
> {code:java}
> if (blocksPerChunk > 0) {
>   concatFileChunks(conf);
> }
> {code}
> Then in CopyCommitter's ctor, blocksPerChunk is initialised from the config:
> {code:java}
> blocksPerChunk = context.getConfiguration().getInt(
> DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel(), 0);
> {code}
>  
> But here DistCpOptionSwitch.BLOCKS_PER_CHUNK.getConfigLabel() will always 
> return an empty string, because the switch is constructed without a config 
> label:
> {code:java}
> BLOCKS_PER_CHUNK("",
> new Option("blocksperchunk", true, "If set to a positive value, files"
> + "with more blocks than this value will be split into chunks of "
> + "<blocksperchunk> blocks to be transferred in parallel, and "
> + "reassembled on the destination. By default, <blocksperchunk> is "
> + "0 and the files will be transmitted in their entirety without "
> + "splitting. This switch is only applicable when the source file "
> + "system implements getBlockLocations method and the target file "
> + "system implements concat method"))
> {code}
> As a result, blocksPerChunk falls back to the default value 0, which 
> prevents the chunks from being reassembled.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org