[jira] [Commented] (HDFS-9070) Allow fsck display pending replica location information for being-written blocks

2015-10-17 Thread J.Andreina (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962176#comment-14962176
 ] 

J.Andreina commented on HDFS-9070:
--

Thanks, [~demongaorui]. The recent changes look good.

Some nits:
1. 
{code}
+} else if (isComplete && corruptReplicas != null
+    && corruptReplicas.contains(dnDesc)) {
+  sb.append("CORRUPT)");
+} else if (isComplete && blocksExcess != null
+    && blocksExcess.contains(storedBlock)) {
+  sb.append("EXCESS)");
{code}
I don't think the *isComplete()* check is required in the above code.
A block can be CORRUPT or EXCESS only if it is complete, so the
explicit check should be unnecessary.
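
For illustration, the simplified conditions might look like this (a sketch 
edited from the quoted diff, not the actual patch):
{code}
} else if (corruptReplicas != null && corruptReplicas.contains(dnDesc)) {
  sb.append("CORRUPT)");
} else if (blocksExcess != null && blocksExcess.contains(storedBlock)) {
  sb.append("EXCESS)");
{code}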

2. The test {{testFsckOpenECFiles}} is written for EC files, but the changes 
are not specific to EC files, and nothing EC-related is asserted. Hence the 
test takes a lot of time to run, since it involves 10 DNs.
 IMO, it's okay to test the same thing with a normal non-EC file.


> Allow fsck display pending replica location information for being-written 
> blocks
> 
>
> Key: HDFS-9070
> URL: https://issues.apache.org/jira/browse/HDFS-9070
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: GAO Rui
>Assignee: GAO Rui
> Attachments: HDFS-9070--HDFS-7285.00.patch, 
> HDFS-9070-HDFS-7285.00.patch, HDFS-9070-HDFS-7285.01.patch, 
> HDFS-9070-HDFS-7285.02.patch, HDFS-9070-trunk.03.patch, 
> HDFS-9070-trunk.04.patch, HDFS-9070-trunk.05.patch, HDFS-9070-trunk.06.patch
>
>
> When an EC file is being written, it can be helpful to allow fsck to display 
> the datanode information of the block group being written. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9234) WebHdfs : getContentSummary() should give quota for storage types

2015-10-17 Thread J.Andreina (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962173#comment-14962173
 ] 

J.Andreina commented on HDFS-9234:
--

Thanks, [~surendrasingh], for the patch. Overall it looks good. 

One small nit:
 We can make the test run a little faster by skipping DataNode startup, as we 
are not creating any files.
Starting the cluster with 0 DataNodes will help the test run faster:
{code}
try {
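  // No files are created in this test, so zero DataNodes are sufficient.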
  cluster = new MiniDFSCluster.Builder(conf).numDataNodes(0).build();
{code}

> WebHdfs : getContentSummary() should give quota for storage types
> -
>
> Key: HDFS-9234
> URL: https://issues.apache.org/jira/browse/HDFS-9234
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9234-001.patch, HDFS-9234-002.patch, 
> HDFS-9234-003.patch, HDFS-9234-004.patch
>
>
> Currently the WebHDFS API for ContentSummary gives only the name quota and 
> space quota, but not the storage-type quotas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8673) HDFS reports file already exists if there is a file/dir name end with ._COPYING_

2015-10-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962170#comment-14962170
 ] 

Hadoop QA commented on HDFS-8673:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12746128/HDFS-8673.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 58590fe |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13040/console |


This message was automatically generated.

> HDFS reports file already exists if there is a file/dir name end with 
> ._COPYING_
> 
>
> Key: HDFS-8673
> URL: https://issues.apache.org/jira/browse/HDFS-8673
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.0
>Reporter: Chen He
>Assignee: Chen He
> Attachments: HDFS-8673.000-WIP.patch, HDFS-8673.000.patch, 
> HDFS-8673.001.patch, HDFS-8673.002.patch, HDFS-8673.003.patch, 
> HDFS-8673.003.patch
>
>
> Because the CLI uses CommandWithDestination.java, which adds "._COPYING_" to 
> the tail of the file name when it does the copy, it will cause a problem if 
> there is already a file/dir called *._COPYING_ on HDFS.
> For file:
> -bash-4.1$ hadoop fs -put 5M /user/occ/
> -bash-4.1$ hadoop fs -mv /user/occ/5M /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -ls /user/occ/
> Found 1 items
> -rw-r--r--   1 occ supergroup5242880 2015-06-26 05:16 
> /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -put 128K /user/occ/5M
> -bash-4.1$ hadoop fs -ls /user/occ/
> Found 1 items
> -rw-r--r--   1 occ supergroup 131072 2015-06-26 05:19 /user/occ/5M
> For dir:
> -bash-4.1$ hadoop fs -mkdir /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -ls /user/occ/
> Found 1 items
> drwxr-xr-x   - occ supergroup  0 2015-06-26 05:24 
> /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -put 128K /user/occ/5M
> put: /user/occ/5M._COPYING_ already exists as a directory
> -bash-4.1$ hadoop fs -ls /user/occ/
> (/user/occ/5M._COPYING_ is gone)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8673) HDFS reports file already exists if there is a file/dir name end with ._COPYING_

2015-10-17 Thread J.Andreina (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962168#comment-14962168
 ] 

J.Andreina commented on HDFS-8673:
--

I am just worried that it would break the existing behavior of overwriting 
those temp files, [~airbots]. If everyone agrees, then I am okay with these 
changes, provided the Jira is marked as an incompatible change.

> HDFS reports file already exists if there is a file/dir name end with 
> ._COPYING_
> 
>
> Key: HDFS-8673
> URL: https://issues.apache.org/jira/browse/HDFS-8673
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.0
>Reporter: Chen He
>Assignee: Chen He
> Attachments: HDFS-8673.000-WIP.patch, HDFS-8673.000.patch, 
> HDFS-8673.001.patch, HDFS-8673.002.patch, HDFS-8673.003.patch, 
> HDFS-8673.003.patch
>
>
> Because the CLI uses CommandWithDestination.java, which adds "._COPYING_" to 
> the tail of the file name when it does the copy, it will cause a problem if 
> there is already a file/dir called *._COPYING_ on HDFS.
> For file:
> -bash-4.1$ hadoop fs -put 5M /user/occ/
> -bash-4.1$ hadoop fs -mv /user/occ/5M /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -ls /user/occ/
> Found 1 items
> -rw-r--r--   1 occ supergroup5242880 2015-06-26 05:16 
> /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -put 128K /user/occ/5M
> -bash-4.1$ hadoop fs -ls /user/occ/
> Found 1 items
> -rw-r--r--   1 occ supergroup 131072 2015-06-26 05:19 /user/occ/5M
> For dir:
> -bash-4.1$ hadoop fs -mkdir /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -ls /user/occ/
> Found 1 items
> drwxr-xr-x   - occ supergroup  0 2015-06-26 05:24 
> /user/occ/5M._COPYING_
> -bash-4.1$ hadoop fs -put 128K /user/occ/5M
> put: /user/occ/5M._COPYING_ already exists as a directory
> -bash-4.1$ hadoop fs -ls /user/occ/
> (/user/occ/5M._COPYING_ is gone)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9229) Expose size of NameNode directory as a metric

2015-10-17 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-9229:

Fix Version/s: (was: 2.8.0)

> Expose size of NameNode directory as a metric
> -
>
> Key: HDFS-9229
> URL: https://issues.apache.org/jira/browse/HDFS-9229
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: HDFS-9229.001.patch
>
>
> Useful for admins in reserving / managing NN local file system space. Also 
> useful when transferring NN backups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-17 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962145#comment-14962145
 ] 

Mingliang Liu commented on HDFS-9184:
-

The failing tests seem unrelated; they pass locally (on Linux and Mac).

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch, 
> HDFS-9184.008.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issues it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which 
> is obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top-level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. Spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, Kerberos 
> authenticated connections and insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in thread-locals. Specifically, on 
> the client side the thread-local object is passed to the NN as a part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s thread-locals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger records the caller context for each 
> operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9252) Change TestFileTruncate to use FsDatasetTestUtils to get block file size and genstamp.

2015-10-17 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962134#comment-14962134
 ] 

Colin Patrick McCabe commented on HDFS-9252:


Thanks, [~eddyxu].

{code}
+ if (blockFile.equals(listdir[j])) {
{code}

It seems like 
{{blockFile.getCanonicalPath().equals(listdir[j].getCanonicalPath())}} would be 
better. See http://stackoverflow.com/questions/8930859/java-file-equals
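
A minimal illustration of the difference (hypothetical paths, just for this 
discussion):
{code}
File a = new File("dir/block_1001");
File b = new File("dir/./block_1001");
// File.equals compares abstract pathnames, so these are not equal:
boolean lexical = a.equals(b);                                        // false
// Canonical paths resolve "." and ".." first (may throw IOException):
boolean canonical = a.getCanonicalPath().equals(b.getCanonicalPath()); // true
{code}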

{code}
  /**
   * Get the length of the underlying data file.
   */
  long getDataLength(ExtendedBlock eb) throws IOException;

  /**
   * Get the generation stamp from the persistent stored metadata file.
   */
  long getPersistentGenerationStamp(ExtendedBlock block) throws IOException;
{code}

The {{ExtendedBlock}} structure has both a length and a genstamp field.  Maybe 
it would be clearer if these methods were named {{getStoredDataLength}} and 
{{getStoredGenerationStamp}}?  Also, the Javadoc should make it clear that they 
are getting the stored length and genstamp, and probably avoid references to 
"the underlying file" (some {{FSDatasetSpi}} implementations don't use files).

{code}
assertEquals(utils.getPersistentGenerationStamp(newBlock.getBlock()),
    newBlock.getBlock().getGenerationStamp());
{code}
Hmm.  It seems like the order is reversed here, right?  In {{assertEquals}}, 
the thing that we "expect" to see should come first, not second.
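
i.e., swapping the arguments (a sketch against the quoted test code):
{code}
assertEquals(newBlock.getBlock().getGenerationStamp(),
    utils.getPersistentGenerationStamp(newBlock.getBlock()));
{code}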

> Change TestFileTruncate to use FsDatasetTestUtils to get block file size and 
> genstamp.
> --
>
> Key: HDFS-9252
> URL: https://issues.apache.org/jira/browse/HDFS-9252
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9252.00.patch
>
>
> {{TestFileTruncate}} verifies block size and genstamp by directly accessing 
> the  local filesystem, e.g.:
> {code}
> assertTrue(cluster.getBlockMetadataFile(dn0,
>     newBlock.getBlock()).getName().endsWith(
>         newBlock.getBlock().getGenerationStamp() + ".meta"));
> {code}
> Let's abstract the fsdataset-specific logic behind FsDatasetTestUtils.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962088#comment-14962088
 ] 

Hadoop QA commented on HDFS-9184:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  21m 18s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m 38s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m  2s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 57s | The applied patch generated  5 
new checkstyle issues (total was 402, now 406). |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 46s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 44s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |   8m 19s | Tests passed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |  66m 58s | Tests failed in hadoop-hdfs. |
| | | 126m  4s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12767223/HDFS-9184.008.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 58590fe |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13039/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13039/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13039/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13039/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13039/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13039/console |


This message was automatically generated.

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch, 
> HDFS-9184.008.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issues it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which 
> is obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top-level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. Spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, Kerberos 
> authenticated connections and insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in thread-locals. Specifically, on 
> the client side the thread-local object is passed to the NN as a part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s thread-locals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger records the caller context for each 
> operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-17 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9184:

Attachment: HDFS-9184.008.patch

Per offline discussion, the {{signature}} is changed from {{String}} to 
{{byte[]}}. We also use {{LogCapture}} for testing end-to-end caller context 
logging. 
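
For reference, client-side usage would look roughly like this (the class and 
method names here are assumptions based on this proposal, not confirmed API):
{code}
// Hypothetical usage: build a caller context with a byte[] signature and
// install it in a thread-local; the RPC layer then sends it in the header.
byte[] signature = computeSignature();  // hypothetical helper
CallerContext context = new CallerContext.Builder("hive_query_id_42")
    .setSignature(signature)            // now byte[] rather than String
    .build();
CallerContext.setCurrent(context);
{code}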

> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch, 
> HDFS-9184.008.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issues it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which 
> is obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top-level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. Spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, Kerberos 
> authenticated connections and insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in thread-locals. Specifically, on 
> the client side the thread-local object is passed to the NN as a part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s thread-locals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger records the caller context for each 
> operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9229) Expose size of NameNode directory as a metric

2015-10-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961935#comment-14961935
 ] 

Hadoop QA commented on HDFS-9229:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  27m  8s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 44s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m  3s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   3m 17s | Site still builds. |
| {color:red}-1{color} | checkstyle |   2m 51s | The applied patch generated  4 
new checkstyle issues (total was 421, now 424). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 43s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |   8m 28s | Tests passed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |  65m 31s | Tests failed in hadoop-hdfs. |
| | | 134m 26s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestRecoverStripedFile |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks |
| Timed out tests | org.apache.hadoop.hdfs.tools.TestDFSAdminWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12767211/HDFS-9229.001.patch |
| Optional Tests | site javadoc javac unit findbugs checkstyle |
| git revision | trunk / 58590fe |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13038/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13038/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13038/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13038/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13038/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13038/console |


This message was automatically generated.

> Expose size of NameNode directory as a metric
> -
>
> Key: HDFS-9229
> URL: https://issues.apache.org/jira/browse/HDFS-9229
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-9229.001.patch
>
>
> Useful for admins in reserving / managing NN local file system space. Also 
> useful when transferring NN backups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9229) Expose size of NameNode directory as a metric

2015-10-17 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-9229:
-
Status: Patch Available  (was: Open)

> Expose size of NameNode directory as a metric
> -
>
> Key: HDFS-9229
> URL: https://issues.apache.org/jira/browse/HDFS-9229
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-9229.001.patch
>
>
> Useful for admins in reserving / managing NN local file system space. Also 
> useful when transferring NN backups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9229) Expose size of NameNode directory as a metric

2015-10-17 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-9229:
-
Attachment: HDFS-9229.001.patch

Attached Patch.
Please review...

> Expose size of NameNode directory as a metric
> -
>
> Key: HDFS-9229
> URL: https://issues.apache.org/jira/browse/HDFS-9229
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HDFS-9229.001.patch
>
>
> Useful for admins in reserving / managing NN local file system space. Also 
> useful when transferring NN backups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9241) HDFS clients can't construct HdfsConfiguration instances

2015-10-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961842#comment-14961842
 ] 

Steve Loughran commented on HDFS-9241:
--

To clarify, then:

# the thin client doesn't have a backwards-compatible way to force-load 
hdfs-site
# the "two lines of code" proposed as a workaround are in fact package-private
# the reason for introducing this change is that there are some deprecated 
tags

I don't want to come over all [~aw] here, but this isn't justification for 
making things incompatible. And while, yes, I can include the hdfs "all" JAR, 
that misses the purpose of the hdfs-client package: to produce a leaner, 
client-side-only package.

We've long had a problem in HDFS: whereas Yarn's {{YarnConfiguration}}, and 
indeed {{JobConfiguration}}, have been public, with stable string constants 
designed for public consumption, the sole set of string constants in HDFS has 
been considered private, free to change on a whim. We downstream developers 
end up importing something which has a history of being broken (HDFS-6418), 
on the grounds that "people downstream should have cut and pasted strings into 
their source". At least when people use {{DFSConfigKeys}} values, you can use 
the IDE to find the load points.
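
For example (typical downstream usage, for illustration only):
{code}
// "Find usages" on the constant locates every read of this setting in the
// IDE; a cut-and-pasted "dfs.replication" string cannot be traced this way.
int replication = conf.getInt(DFSConfigKeys.DFS_REPLICATION_KEY, 3);
{code}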

I propose:
# accepting that yes, there are deprecated constants in {{DFSConfigKeys}}, but 
they are used in client apps
# moving it and the HdfsConfiguration class into hdfs-client

It's not going to add new dependencies, and it will retain compatibility. 
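
For context, the client-side pattern at stake is simply this (a sketch of 
typical downstream code, not from any patch):
{code}
// Constructing HdfsConfiguration forces hdfs-default.xml and hdfs-site.xml
// to be registered as default resources before any keys are read.
Configuration conf = new HdfsConfiguration();
FileSystem fs = FileSystem.get(conf);
{code}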

> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9241
> URL: https://issues.apache.org/jira/browse/HDFS-9241
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Attachments: HDFS-9241.000.patch
>
>
> The changes for the hdfs-client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it only lives server-side. 
> This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9241) HDFS clients can't construct HdfsConfiguration instances

2015-10-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961839#comment-14961839
 ] 

Steve Loughran commented on HDFS-9241:
--

bq. My answer is no – the current implementation has a class 
HdfsConfigurationLoader to load the configurations that serves the original 
purposes of HdfsConfiguration on the client side.

You've just created an incompatible change on branch-2. I've hit this problem 
in Slider, which we build against 2.6, only to find now that it doesn't work. 
Others may have a similar problem.

bq. The reason is that HdfsConfiguration are used by both the client and the 
server side. It contains deprecated keys for the server side, which IMO should 
not be exposed to the clients at all.

Well, they are tagged as deprecated. Again, they may still get used.

> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9241
> URL: https://issues.apache.org/jira/browse/HDFS-9241
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Attachments: HDFS-9241.000.patch
>
>
> The changes for the hdfs-client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it only lives server-side. 
> This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8471) Add read block support for DataNode HTTP/2 server

2015-10-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961818#comment-14961818
 ] 

Hadoop QA commented on HDFS-8471:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  1s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12767204/HDFS-8471-v8.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 58590fe |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13037/console |


This message was automatically generated.

> Add read block support for DataNode HTTP/2 server
> -
>
> Key: HDFS-8471
> URL: https://issues.apache.org/jira/browse/HDFS-8471
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: HDFS-7966
>
> Attachments: HDFS-8471-v6.patch, HDFS-8471-v7.patch, 
> HDFS-8471-v8.patch, HDFS-8471.1.patch, HDFS-8471.2.patch, HDFS-8471.3.patch, 
> HDFS-8471.4.patch, HDFS-8471.5.patch, HDFS-8471.patch
>
>
> Based on the streamed channel introduced in HDFS-8515.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8471) Add read block support for DataNode HTTP/2 server

2015-10-17 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HDFS-8471:

Attachment: HDFS-8471-v8.patch

Rebase and add OS cache management.

> Add read block support for DataNode HTTP/2 server
> -
>
> Key: HDFS-8471
> URL: https://issues.apache.org/jira/browse/HDFS-8471
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: HDFS-7966
>
> Attachments: HDFS-8471-v6.patch, HDFS-8471-v7.patch, 
> HDFS-8471-v8.patch, HDFS-8471.1.patch, HDFS-8471.2.patch, HDFS-8471.3.patch, 
> HDFS-8471.4.patch, HDFS-8471.5.patch, HDFS-8471.patch
>
>
> Based on the streamed channel introduced in HDFS-8515.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9129) Move the safemode block count into BlockManager

2015-10-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961801#comment-14961801
 ] 

Hadoop QA commented on HDFS-9129:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 57s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 5 new or modified test files. |
| {color:green}+1{color} | javac |   8m  7s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 41s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 25s | The applied patch generated  4 
new checkstyle issues (total was 626, now 575). |
| {color:green}+1{color} | whitespace |   0m  3s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 32s | The patch appears to introduce 2 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  51m 12s | Tests failed in hadoop-hdfs. |
| | |  97m 41s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestRecoverStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12767196/HDFS-9129.006.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 58590fe |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13036/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13036/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13036/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13036/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13036/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13036/console |


This message was automatically generated.

> Move the safemode block count into BlockManager
> ---
>
> Key: HDFS-9129
> URL: https://issues.apache.org/jira/browse/HDFS-9129
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9129.000.patch, HDFS-9129.001.patch, 
> HDFS-9129.002.patch, HDFS-9129.003.patch, HDFS-9129.004.patch, 
> HDFS-9129.005.patch, HDFS-9129.006.patch
>
>
> The {{SafeMode}} needs to track whether there are enough blocks so that the 
> NN can get out of safemode. These fields can be moved to the 
> {{BlockManager}} class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)