[jira] [Commented] (HDFS-6645) Add test for successive Snapshots between XAttr modifications

2014-07-09 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055882#comment-14055882
 ] 

Jing Zhao commented on HDFS-6645:
-

+1. Thanks [~schu]! I will commit it tomorrow morning. 

 Add test for successive Snapshots between XAttr modifications
 -

 Key: HDFS-6645
 URL: https://issues.apache.org/jira/browse/HDFS-6645
 Project: Hadoop HDFS
  Issue Type: Test
  Components: snapshots, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6645.001.patch


 In the current TestXAttrWithSnapshot unit tests, we create a single snapshot 
 per test.
 We should test taking multiple snapshots on a path in between XAttr 
 modifications of that path. We should also verify that deletion of a snapshot 
 does not somehow alter the XAttrs of the other snapshots of the same path.
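 A minimal sketch of the scenario to cover (not the actual TestXAttrWithSnapshot 
 code; it assumes a DistributedFileSystem handle fs backed by a MiniDFSCluster, 
 an already-snapshottable directory /snapRoot, and JUnit's assertArrayEquals):
 {code}
 Path dir = new Path("/snapRoot");
 fs.setXAttr(dir, "user.a", "v1".getBytes());
 fs.createSnapshot(dir, "s1");
 fs.setXAttr(dir, "user.a", "v2".getBytes());
 fs.createSnapshot(dir, "s2");

 // Each snapshot should pin the XAttr value it saw at creation time.
 assertArrayEquals("v1".getBytes(),
     fs.getXAttrs(new Path("/snapRoot/.snapshot/s1")).get("user.a"));
 assertArrayEquals("v2".getBytes(),
     fs.getXAttrs(new Path("/snapRoot/.snapshot/s2")).get("user.a"));

 // Deleting one snapshot must not alter the XAttrs of the other snapshots.
 fs.deleteSnapshot(dir, "s1");
 assertArrayEquals("v2".getBytes(),
     fs.getXAttrs(new Path("/snapRoot/.snapshot/s2")).get("user.a"));
 {code}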



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6646) [ HDFS Rolling Upgrade - Shell ] shutdownDatanode and getDatanodeInfo usage is missed

2014-07-09 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-6646:


   Resolution: Fixed
Fix Version/s: (was: 3.0.0)
   2.6.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2.

Thanks [~brahmareddy].

 [ HDFS Rolling Upgrade - Shell  ] shutdownDatanode and getDatanodeInfo usage 
 is missed
 --

 Key: HDFS-6646
 URL: https://issues.apache.org/jira/browse/HDFS-6646
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: tools
Affects Versions: 2.4.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Fix For: 2.6.0

 Attachments: HDFS-6646.patch, HDFS-6646_1.patch


 Usage message is missing for shutdownDatanode and getDatanodeInfo.
 Please check the following for the same (it prints the whole usage for dfsadmin):
 hdfs dfsadmin -shutdownDatanode
 Usage: java DFSAdmin
 Note: Administrative commands can only be run as the HDFS superuser.
[-report]
[-safemode enter | leave | get | wait]
[-allowSnapshot <snapshotDir>]
[-disallowSnapshot <snapshotDir>]
[-saveNamespace]
[-rollEdits]
[-restoreFailedStorage true|false|check]
[-refreshNodes]
[-finalizeUpgrade]
[-rollingUpgrade [<query|prepare|finalize>]]
[-metasave filename]
[-refreshServiceAcl]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-refreshCallQueue]
[-printTopology]
[-refreshNamenodes datanodehost:port]
[-deleteBlockPool datanode-host:port blockpoolId [force]]
[-setQuota <quota> <dirname>...<dirname>]
[-clrQuota <dirname>...<dirname>]
[-setSpaceQuota <quota> <dirname>...<dirname>]
[-clrSpaceQuota <dirname>...<dirname>]
[-setBalancerBandwidth <bandwidth in bytes per second>]
[-fetchImage <local directory>]
[-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
[-getDatanodeInfo <datanode_host:ipc_port>]
[-help [cmd]]
 Generic options supported are
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
 -libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
 -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
 The general command line syntax is
 bin/hadoop command [genericOptions] [commandOptions]
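
 After the fix, running the command should print only the per-command usage, 
 presumably along these lines (exact wording depends on the patch):
 {noformat}
 hdfs dfsadmin -shutdownDatanode
 Usage: java DFSAdmin [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
 {noformat}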



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6646) [ HDFS Rolling Upgrade - Shell ] shutdownDatanode and getDatanodeInfo usage is missed

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055885#comment-14055885
 ] 

Hudson commented on HDFS-6646:
--

FAILURE: Integrated in Hadoop-trunk-Commit #5848 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5848/])
HDFS-6646. [ HDFS Rolling Upgrade - Shell ] shutdownDatanode and 
getDatanodeInfo usage is missed ( Contributed by Brahma Reddy Battula) 
(vinayakumarb: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1609020)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java


 [ HDFS Rolling Upgrade - Shell  ] shutdownDatanode and getDatanodeInfo usage 
 is missed
 --

 Key: HDFS-6646
 URL: https://issues.apache.org/jira/browse/HDFS-6646
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: tools
Affects Versions: 2.4.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Fix For: 2.6.0

 Attachments: HDFS-6646.patch, HDFS-6646_1.patch


 Usage message is missing for shutdownDatanode and getDatanodeInfo.
 Please check the following for the same (it prints the whole usage for dfsadmin):
 hdfs dfsadmin -shutdownDatanode
 Usage: java DFSAdmin
 Note: Administrative commands can only be run as the HDFS superuser.
[-report]
[-safemode enter | leave | get | wait]
[-allowSnapshot <snapshotDir>]
[-disallowSnapshot <snapshotDir>]
[-saveNamespace]
[-rollEdits]
[-restoreFailedStorage true|false|check]
[-refreshNodes]
[-finalizeUpgrade]
[-rollingUpgrade [<query|prepare|finalize>]]
[-metasave filename]
[-refreshServiceAcl]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-refreshCallQueue]
[-printTopology]
[-refreshNamenodes datanodehost:port]
[-deleteBlockPool datanode-host:port blockpoolId [force]]
[-setQuota <quota> <dirname>...<dirname>]
[-clrQuota <dirname>...<dirname>]
[-setSpaceQuota <quota> <dirname>...<dirname>]
[-clrSpaceQuota <dirname>...<dirname>]
[-setBalancerBandwidth <bandwidth in bytes per second>]
[-fetchImage <local directory>]
[-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
[-getDatanodeInfo <datanode_host:ipc_port>]
[-help [cmd]]
 Generic options supported are
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
 -libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
 -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
 The general command line syntax is
 bin/hadoop command [genericOptions] [commandOptions]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6455) NFS: Exception should be added in NFS log for invalid separator in allowed.hosts

2014-07-09 Thread Abhiraj Butala (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055897#comment-14055897
 ] 

Abhiraj Butala commented on HDFS-6455:
--

Thanks for reviewing the patch, [~brandonli]. This is the output of the 
showmount command:

{code}
abutala@abutala-vBox:~$ showmount -e 127.0.1.1
rpc mount export: RPC: Timed out
{code}

I don't see any errors or log messages in the NFS server output. What should 
the correct behavior of showmount be in this case?

 NFS: Exception should be added in NFS log for invalid separator in 
 allowed.hosts
 

 Key: HDFS-6455
 URL: https://issues.apache.org/jira/browse/HDFS-6455
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.2.0
Reporter: Yesha Vora
 Attachments: HDFS-6455.patch


 The error for an invalid separator in the dfs.nfs.exports.allowed.hosts 
 property should be logged in the NFS log file instead of the nfs.out file.
 Steps to reproduce:
 1. Pass an invalid separator in dfs.nfs.exports.allowed.hosts:
 {noformat}
 <property><name>dfs.nfs.exports.allowed.hosts</name><value>host1 ro:host2 
 rw</value></property>
 {noformat}
 2. Restart the NFS server. The NFS server fails to start and prints an exception to the console.
 {noformat}
 [hrt_qa@host1 hwqe]$ ssh -o StrictHostKeyChecking=no -o 
 UserKnownHostsFile=/dev/null host1 sudo su - -c 
 \/usr/lib/hadoop/sbin/hadoop-daemon.sh start nfs3\ hdfs
 starting nfs3, logging to /tmp/log/hadoop/hdfs/hadoop-hdfs-nfs3-horst1.out
 DEPRECATED: Use of this script to execute hdfs command is deprecated.
 Instead use the hdfs command for it.
 Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly 
 formatted line 'host1 ro:host2 rw'
   at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356)
   at org.apache.hadoop.nfs.NfsExports.init(NfsExports.java:151)
   at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54)
   at 
 org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.init(RpcProgramNfs3.java:176)
   at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.init(Nfs3.java:43)
   at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59)
 {noformat}
 NFS log does not print any error message. It directly shuts down. 
 {noformat}
 STARTUP_MSG:   java = 1.6.0_31
 /
 2014-05-27 18:47:13,972 INFO  nfs3.Nfs3Base (SignalLogger.java:register(91)) 
 - registered UNIX signal handlers for [TERM, HUP, INT]
 2014-05-27 18:47:14,169 INFO  nfs3.IdUserGroup 
 (IdUserGroup.java:updateMapInternal(159)) - Updated user map size:259
 2014-05-27 18:47:14,179 INFO  nfs3.IdUserGroup 
 (IdUserGroup.java:updateMapInternal(159)) - Updated group map size:73
 2014-05-27 18:47:14,192 INFO  nfs3.Nfs3Base (StringUtils.java:run(640)) - 
 SHUTDOWN_MSG:
 /
 SHUTDOWN_MSG: Shutting down Nfs3 at 
 {noformat}
 The nfs.out file has the exception:
 {noformat}
 DEPRECATED: Use of this script to execute hdfs command is deprecated.
 Instead use the hdfs command for it.
 Exception in thread "main" java.lang.IllegalArgumentException: Incorrectly 
 formatted line 'host1 ro:host2 rw'
 at org.apache.hadoop.nfs.NfsExports.getMatch(NfsExports.java:356)
 at org.apache.hadoop.nfs.NfsExports.init(NfsExports.java:151)
 at org.apache.hadoop.nfs.NfsExports.getInstance(NfsExports.java:54)
 at 
 org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.init(RpcProgramNfs3.java:176)
 at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.init(Nfs3.java:43)
 at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:59)
 ulimit -a for user hdfs
 core file size  (blocks, -c) 409600
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 188893
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 65536
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 {noformat}
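 For contrast, entries in dfs.nfs.exports.allowed.hosts are separated by ';' 
 (per the HDFS NFS gateway documentation), so a correctly formatted value 
 would look like:
 {noformat}
 <property>
   <name>dfs.nfs.exports.allowed.hosts</name>
   <value>host1 ro;host2 rw</value>
 </property>
 {noformat}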



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Aaron T. Myers (JIRA)
Aaron T. Myers created HDFS-6647:


 Summary: Edit log corruption when pipeline recovery occurs for 
deleted file present in snapshot
 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers


I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the edit 
log for a file after an OP_DELETE has previously been logged for that file. 
Such an edit log sequence cannot then be successfully read by the NameNode.

More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-6647:
-

Attachment: HDFS-6647-failing-test.patch

I'm attaching a test case which illustrates the problem. When this problem 
occurs, the NN will be unable to read the edit log and will fail to start with 
an error like the following:

{noformat}
java.io.FileNotFoundException: File does not exist: /test-file
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:64)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:54)
  at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:444)
  at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:227)
  
  at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:136)
  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:816)
  at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:676)
  at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:279)
  at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:964)
  at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:711)
  at 
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:530)
  at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:586)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:752)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:736)
  at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1412)
{noformat}

The sequence of events that I've identified that can cause this are the 
following:

# A file is opened for write and some data has been written/flushed to it, 
causing a block to be allocated.
# A snapshot is taken which includes the file.
# The file is deleted from the present file system, though the client has not 
yet closed the file. This will log an OP_DELETE to the edit log.
# Some error happens that triggers pipeline recovery, which logs an 
OP_UPDATE_BLOCKS to the edit log.

The reason it's possible for this to happen is basically because the 
{{updatePipeline}} RPC never checks if the file actually exists, but instead 
just finds the file INode based on the block ID being replaced in the pipeline. 
Later, when we're reading the {{OP_UPDATE_BLOCKS}} from the edit log, however, 
we try to find the file INode based on the path name of the file, which no 
longer exists because of the previous delete.

It's not entirely obvious to me what the right solution to this issue should 
be. It shouldn't be difficult to change the {{FSEditLogLoader}} to be able to 
read the {{OP_UPDATE_BLOCKS}} op if we just change it to look up the INode by 
block ID. On the other hand, however, I'm not entirely sure we should even be 
allowing this sequence of edit log ops in the first place. It doesn't seem 
unreasonable to me that we might check that the file actually exists in the 
present file system in the {{updatePipeline}} RPC call and throw an error if it 
doesn't, since continuing to write to a file that only exists in a snapshot 
doesn't make much sense. Along similar lines, it seems a little odd to me that 
an INode that only exists in the snapshot would continue to be considered 
under-construction, but perhaps that's not unreasonable in itself.
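
As a purely hypothetical sketch of that second option (the names below are 
illustrative stand-ins, not actual HDFS methods), the {{updatePipeline}} 
handling could reject files that no longer exist in the present file system:

{code}
// Hypothetical guard in the updatePipeline RPC handling; resolveInodeByBlockId
// and existsInCurrentFileSystem are illustrative, not real HDFS APIs.
INodeFile file = resolveInodeByBlockId(oldBlock.getBlockId());
if (file == null || !existsInCurrentFileSystem(file)) {
  throw new FileNotFoundException(
      "Cannot update pipeline: file for " + oldBlock
      + " exists only in a snapshot or was deleted");
}
{code}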

Would love to hear others' thoughts on this.

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
 Attachments: HDFS-6647-failing-test.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-6647:
-

Priority: Blocker  (was: Major)

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.

2014-07-09 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056006#comment-14056006
 ] 

Liang Xie commented on HDFS-6631:
-

I see. Comparing with my dev box logfile, I found that the attached 
org.apache.hadoop.hdfs.TestPread-output.txt file did not trigger a real 
hedged read.
I could only find a log line like "Waited 50ms to read from 127.0.0.1:x 
spawning hedged read" in my logfile.
In your file, the execution sequence is:
- read from 127.0.0.1:53908 (here the counter is 1)
- throw ChecksumException
- read from 127.0.0.1:53919 (here the counter is 2)
- return result (line 1127)
That means both read paths went through the {{if (futures.isEmpty())}} flow 
(L1112).

So the root question is: if we set hedged.read.threshold = 50ms, and the 
Mockito.doAnswer has a Thread.sleep(50+1), what does this statement return?
{code}
  Future<ByteBuffer> future = hedgedService.poll(
      dfsClient.getHedgedReadTimeout(), TimeUnit.MILLISECONDS);
{code}

On my dev box, it behaved just as the Javadoc says:
{code}
Retrieves and removes the Future representing the next completed task, waiting
if necessary up to the specified wait time if none are yet present.
Parameters:
timeout - how long to wait before giving up, in units of unit
unit - a TimeUnit determining how to interpret the timeout parameter
Returns:
the Future representing the next completed task or null if the specified
waiting time elapses before one is present
Throws:
InterruptedException - if interrupted while waiting
{code}

So the future will be null.

But on Chris's box, the exception from the thread pool jumps out first, so 
execution goes directly to L1140: catch (ExecutionException e).

So per my current understanding, it should be related to OS thread scheduling 
(granularity); we probably need to enlarge the Mockito sleep interval.
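
For reference, here is a minimal, self-contained sketch (assumed names, not 
the actual test) of the race described above: whether {{poll}} returns null or 
a completed-with-exception future depends on which side of the timeout the 
failing task finishes on.

{code}
import java.util.concurrent.*;

public class PollRaceDemo {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    CompletionService<String> service = new ExecutorCompletionService<>(pool);

    // Mirrors the Mockito answer: sleep just past the 50ms threshold, then fail.
    service.submit(() -> {
      Thread.sleep(51);
      throw new Exception("simulated ChecksumException");
    });

    // Mirrors hedgedService.poll(dfsClient.getHedgedReadTimeout(), ...).
    Future<String> f = service.poll(50, TimeUnit.MILLISECONDS);
    if (f == null) {
      // Usual case: the timeout elapses first, so a hedged read is spawned.
      System.out.println("poll timed out -> hedged read would be spawned");
    } else {
      try {
        f.get();
      } catch (ExecutionException e) {
        // The other side of the race: the failed task completed within the
        // window, so control reaches the catch block instead.
        System.out.println("task failed first: " + e.getCause());
      }
    }
    pool.shutdownNow();
  }
}
{code}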

 TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
 --

 Key: HDFS-6631
 URL: https://issues.apache.org/jira/browse/HDFS-6631
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, test
Affects Versions: 3.0.0, 2.5.0
Reporter: Chris Nauroth
 Attachments: org.apache.hadoop.hdfs.TestPread-output.txt


 {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently.  It looks 
 like a race condition on counting the expected number of loop iterations.  I 
 can repro the test failure more consistently on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6634) inotify in HDFS

2014-07-09 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056010#comment-14056010
 ] 

Steve Loughran commented on HDFS-6634:
--

I'd argue that making the audit log generally accessible via some Kafka stream 
would be broadly useful, as it would support other use cases being discussed, 
such as identifying and deleting temporary files, or, as Spotify does with its 
audit log, identifying files that haven't been read for a while.

If custom code is needed, that's what we call a library.

 inotify in HDFS
 ---

 Key: HDFS-6634
 URL: https://issues.apache.org/jira/browse/HDFS-6634
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client, namenode, qjm
Reporter: James Thomas
Assignee: James Thomas
 Attachments: inotify-intro.pdf


 Design a mechanism for applications like search engines to access the HDFS 
 edit stream.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.

2014-07-09 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie updated HDFS-6631:


Attachment: HDFS-6631.txt

 TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
 --

 Key: HDFS-6631
 URL: https://issues.apache.org/jira/browse/HDFS-6631
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, test
Affects Versions: 3.0.0, 2.5.0
Reporter: Chris Nauroth
 Attachments: HDFS-6631.txt, 
 org.apache.hadoop.hdfs.TestPread-output.txt


 {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently.  It looks 
 like a race condition on counting the expected number of loop iterations.  I 
 can repro the test failure more consistently on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.

2014-07-09 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie updated HDFS-6631:


Status: Patch Available  (was: Open)

 TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
 --

 Key: HDFS-6631
 URL: https://issues.apache.org/jira/browse/HDFS-6631
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, test
Affects Versions: 3.0.0, 2.5.0
Reporter: Chris Nauroth
 Attachments: HDFS-6631.txt, 
 org.apache.hadoop.hdfs.TestPread-output.txt


 {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently.  It looks 
 like a race condition on counting the expected number of loop iterations.  I 
 can repro the test failure more consistently on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.

2014-07-09 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056019#comment-14056019
 ] 

Liang Xie commented on HDFS-6631:
-

I just uploaded a tentative patch. [~cnauroth], could you try it in your 
easy-repro environment? Thank you very much!

 TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
 --

 Key: HDFS-6631
 URL: https://issues.apache.org/jira/browse/HDFS-6631
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, test
Affects Versions: 3.0.0, 2.5.0
Reporter: Chris Nauroth
 Attachments: HDFS-6631.txt, 
 org.apache.hadoop.hdfs.TestPread-output.txt


 {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently.  It looks 
 like a race condition on counting the expected number of loop iterations.  I 
 can repro the test failure more consistently on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.

2014-07-09 Thread Liang Xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Xie reassigned HDFS-6631:
---

Assignee: Liang Xie

 TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
 --

 Key: HDFS-6631
 URL: https://issues.apache.org/jira/browse/HDFS-6631
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, test
Affects Versions: 3.0.0, 2.5.0
Reporter: Chris Nauroth
Assignee: Liang Xie
 Attachments: HDFS-6631.txt, 
 org.apache.hadoop.hdfs.TestPread-output.txt


 {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently.  It looks 
 like a race condition on counting the expected number of loop iterations.  I 
 can repro the test failure more consistently on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056079#comment-14056079
 ] 

Hudson commented on HDFS-6614:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #608 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/608/])
HDFS-6614. Addendum patch to shorten TestPread run time with smaller retry 
timeout setting. Contributed by Liang Xie. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608846)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java


 shorten TestPread run time with a smaller retry timeout setting
 ---

 Key: HDFS-6614
 URL: https://issues.apache.org/jira/browse/HDFS-6614
 Project: Hadoop HDFS
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0, 2.5.0
Reporter: Liang Xie
Assignee: Liang Xie
Priority: Minor
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-6614-addmium.txt, HDFS-6614.txt


 Just noticed logs like this from TestPread:
 DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec
 so I tried to set a smaller retry window value.
 Before patch:
  T E S T S
 ---
 Running org.apache.hadoop.hdfs.TestPread
 Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - 
 in org.apache.hadoop.hdfs.TestPread
 After the change:
  T E S T S
 ---
 Running org.apache.hadoop.hdfs.TestPread
 Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - 
 in org.apache.hadoop.hdfs.TestPread
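 The kind of change involved is a one-line tweak in test setup; a hedged 
 sketch, assuming the retry window is controlled by dfs.client.retry.window.base:
 {code}
 Configuration conf = new HdfsConfiguration();
 // Shrink the base retry window so retries after a failed read back off for
 // milliseconds instead of several seconds per attempt.
 conf.setInt(DFSConfigKeys.DFS_CLIENT_RETRY_WINDOW_BASE, 10);
 {code}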



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6638) shorten test run time with a smaller retry timeout setting

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056077#comment-14056077
 ] 

Hudson commented on HDFS-6638:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #608 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/608/])
HDFS-6638. Shorten test run time with a smaller retry timeout setting. 
Contributed by Liang Xie. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608905)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockMissingException.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocalLegacy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientReportBadBlock.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestCrcCorruption.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptedTransfer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMissingBlocksAlert.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFS.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailoverWithBlockTokensEnabled.java


 shorten test run time with a smaller retry timeout setting
 --

 Key: HDFS-6638
 URL: https://issues.apache.org/jira/browse/HDFS-6638
 Project: Hadoop HDFS
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0, 2.5.0
Reporter: Liang Xie
Assignee: Liang Xie
 Fix For: 3.0.0, 2.6.0

 Attachments: HDFS-6638.txt


 Similar to HDFS-6614, I think this is a general test-duration optimization 
 tip, so I grepped for "IOException, will wait for" in a full test run under 
 the hdfs project, found several test cases that could be optimized, and made 
 a simple patch.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6646) [ HDFS Rolling Upgrade - Shell ] shutdownDatanode and getDatanodeInfo usage is missed

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056083#comment-14056083
 ] 

Hudson commented on HDFS-6646:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #608 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/608/])
HDFS-6646. [ HDFS Rolling Upgrade - Shell ] shutdownDatanode and 
getDatanodeInfo usage is missed ( Contributed by Brahma Reddy Battula) 
(vinayakumarb: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1609020)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java


 [ HDFS Rolling Upgrade - Shell  ] shutdownDatanode and getDatanodeInfo usage 
 is missed
 --

 Key: HDFS-6646
 URL: https://issues.apache.org/jira/browse/HDFS-6646
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: tools
Affects Versions: 2.4.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Fix For: 2.6.0

 Attachments: HDFS-6646.patch, HDFS-6646_1.patch


 Usage message is missing for shutdownDatanode and getDatanodeInfo.
 Please check the following for the same (it prints the whole usage for dfsadmin):
 hdfs dfsadmin -shutdownDatanode
 Usage: java DFSAdmin
 Note: Administrative commands can only be run as the HDFS superuser.
[-report]
[-safemode enter | leave | get | wait]
[-allowSnapshot <snapshotDir>]
[-disallowSnapshot <snapshotDir>]
[-saveNamespace]
[-rollEdits]
[-restoreFailedStorage true|false|check]
[-refreshNodes]
[-finalizeUpgrade]
[-rollingUpgrade [<query|prepare|finalize>]]
[-metasave filename]
[-refreshServiceAcl]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-refreshCallQueue]
[-printTopology]
[-refreshNamenodes datanodehost:port]
[-deleteBlockPool datanode-host:port blockpoolId [force]]
[-setQuota <quota> <dirname>...<dirname>]
[-clrQuota <dirname>...<dirname>]
[-setSpaceQuota <quota> <dirname>...<dirname>]
[-clrSpaceQuota <dirname>...<dirname>]
[-setBalancerBandwidth <bandwidth in bytes per second>]
[-fetchImage <local directory>]
[-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
[-getDatanodeInfo <datanode_host:ipc_port>]
[-help [cmd]]
 Generic options supported are
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
 -libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
 -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
 The general command line syntax is
 bin/hadoop command [genericOptions] [commandOptions]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4286) Changes from BOOKKEEPER-203 broken capability of including bookkeeper-server jar in hidden package of BKJM

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056080#comment-14056080
 ] 

Hudson commented on HDFS-4286:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #608 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/608/])
HDFS-4286. Changes from BOOKKEEPER-203 broken capability of including 
bookkeeper-server jar in hidden package of BKJM. Contributed by Rakesh R. 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608764)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/pom.xml
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithNFS.apt.vm


 Changes from BOOKKEEPER-203 broken capability of including bookkeeper-server 
 jar in hidden package of BKJM
 --

 Key: HDFS-4286
 URL: https://issues.apache.org/jira/browse/HDFS-4286
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Vinayakumar B
Assignee: Rakesh R
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-4286.patch, HDFS-4286.patch


 BOOKKEEPER-203 introduced changes to LedgerLayout to include 
 ManagerFactoryClass instead of ManagerFactoryName.
 Because of this, BKJM cannot shade the bookkeeper-server jar inside the BKJM 
 jar: the LAYOUT znode created by BookieServer is not readable by BKJM, as it 
 has classes in hidden packages (and the same problem applies vice versa).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4221) Remove the format limitation point from BKJM documentation as HDFS-3810 closed

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056081#comment-14056081
 ] 

Hudson commented on HDFS-4221:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #608 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/608/])
HDFS-4221. Remove the format limitation point from BKJM documentation as 
HDFS-3810 closed. Contributed by Rakesh R. (umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608776)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithNFS.apt.vm


 Remove the format limitation point from BKJM documentation as HDFS-3810 closed
 --

 Key: HDFS-4221
 URL: https://issues.apache.org/jira/browse/HDFS-4221
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Uma Maheswara Rao G
Assignee: Rakesh R
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-4221.patch


 Remove the format limitation point from BKJM documentation as HDFS-3810 closed



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5411) Update Bookkeeper dependency to 4.2.3

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056082#comment-14056082
 ] 

Hudson commented on HDFS-5411:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #608 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/608/])
HDFS-5411. Update Bookkeeper dependency to 4.2.3. Contributed by Rakesh R. 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608781)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/BKJMUtil.java
* /hadoop/common/trunk/hadoop-project/pom.xml


 Update Bookkeeper dependency to 4.2.3
 -

 Key: HDFS-5411
 URL: https://issues.apache.org/jira/browse/HDFS-5411
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.2.0
Reporter: Robert Rati
Assignee: Rakesh R
Priority: Minor
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-5411.patch, HDFS-5411.patch


 Update the bookkeeper dependency to 4.2.3. This eases compilation on Fedora 
 platforms.
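 The corresponding dependency pin, as a sketch of what the hadoop-project pom 
 carries after this change:
 {code}
 <dependency>
   <groupId>org.apache.bookkeeper</groupId>
   <artifactId>bookkeeper-server</artifactId>
   <version>4.2.3</version>
 </dependency>
 {code}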



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6627) Rename DataNode#checkWriteAccess to checkReadAccess.

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056078#comment-14056078
 ] 

Hudson commented on HDFS-6627:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #608 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/608/])
HDFS-6627. Rename DataNode#checkWriteAccess to checkReadAccess. Contributed by 
Liang Xie. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608940)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java


 Rename DataNode#checkWriteAccess to checkReadAccess.
 

 Key: HDFS-6627
 URL: https://issues.apache.org/jira/browse/HDFS-6627
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0, 2.5.0
Reporter: Liang Xie
Assignee: Liang Xie
 Fix For: 3.0.0, 2.6.0

 Attachments: HDFS-6627.txt


 Just read the getReplicaVisibleLength() code and found that 
 DataNode.checkWriteAccess is only invoked by 
 DataNode.getReplicaVisibleLength(). Let's rename it to checkReadAccess to 
 avoid confusion, since the real implementation here checks AccessMode.READ:
 {code}
 blockPoolTokenSecretManager.checkAccess(id, null, block,
 BlockTokenSecretManager.AccessMode.READ);
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-3810) Implement format() for BKJM

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056076#comment-14056076
 ] 

Hudson commented on HDFS-3810:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #608 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/608/])
HDFS-4221. Remove the format limitation point from BKJM documentation as 
HDFS-3810 closed. Contributed by Rakesh R. (umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608776)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithNFS.apt.vm


 Implement format() for BKJM
 ---

 Key: HDFS-3810
 URL: https://issues.apache.org/jira/browse/HDFS-3810
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 3.0.0
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Fix For: 3.0.0, 2.0.3-alpha

 Attachments: HDFS-3810.diff, HDFS-3810.diff, HDFS-3810.diff


 At the moment, formatting for BKJM is done on initialization. Reinitializing 
 is a manual process. This JIRA is to implement the JournalManager#format API, 
 so that BKJM can be formatted along with all other storage methods.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist

2014-07-09 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056098#comment-14056098
 ] 

Charles Lamb commented on HDFS-6422:


[~yi.a.liu], [~umamaheswararao],

Given Uma's work on HDFS-6556, I want to clarify what remains to be done on 
this patch. Earlier in the comments, I said:

{quote} 
Throw an exception if:
* the caller requests an attribute that doesn't exist,
* the caller requests an attribute and they don't have proper permissions,
* the caller requests an attribute and they don't have permission to the 
namespace. This applies to the trusted namespace.
* the caller specifies an unknown namespace.

The gist of Linux extended attribute permissions is that you need access to the 
inode to read/write xattr names and you need access to the entity itself (i.e. 
a file or directory) to read/write xattr values. The former is determined by 
the parent directory permissions and the latter by the entity's permissions 
(i.e. the thing on which the extended attributes are associated).

You need scan/execute permissions on the parent (owning) directory to access 
extended attribute names. You need read permission on the entity itself to read 
extended attribute values and you need write permission to modify them.
{quote}

Do you believe you have the permissions checking correct now and that only the 
exception throwing needs to be fixed, or is there still more work to be done on 
read permissions? Specifically, should we be validating xattr name access based 
on scan permission on the inode's owning directory, and xattr value access 
based on the inode's permissions?
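
For comparison, this is the behavior the CLI should mirror: on Linux, getfattr 
for a missing attribute prints an error and exits non-zero (exact wording 
varies with the attr package version):

{noformat}
$ getfattr -n user.blah /tmp/foo
/tmp/foo: user.blah: No such attribute
$ echo $?
1
{noformat}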


 getfattr in CLI doesn't throw exception or return non-0 return code when 
 xattr doesn't exist
 

 Key: HDFS-6422
 URL: https://issues.apache.org/jira/browse/HDFS-6422
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.5.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch


 If you do
 hdfs dfs -getfattr -n user.blah /foo
 and user.blah doesn't exist, the command prints
 # file: /foo
 and a 0 return code.
 It should print an exception and return a non-0 return code instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6630) Unable to fetch the block information by Browsing the file system on Namenode UI through IE9

2014-07-09 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-6630:


Attachment: HDFS-6630.patch

Attached a patch which fixes the issue in IE.
I have verified it in IE 10 on my Windows 8 PC.

Hi [~wheat9], can you take a look at this if possible?

Thanks in advance.

 Unable to fetch the block information  by Browsing the file system on 
 Namenode UI through IE9
 -

 Key: HDFS-6630
 URL: https://issues.apache.org/jira/browse/HDFS-6630
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.1
Reporter: J.Andreina
 Attachments: HDFS-6630.patch


 On IE9, follow the steps below:
 NN UI -> Utilities -> Browse the file system -> click on a file name
 Instead of displaying the block information, it displays:
 {noformat}
 Failed to retreive data from /webhdfs/v1/4?op=GET_BLOCK_LOCATIONS: No 
 Transport 
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6614) shorten TestPread run time with a smaller retry timeout setting

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056124#comment-14056124
 ] 

Hudson commented on HDFS-6614:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1799 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1799/])
HDFS-6614. Addendum patch to shorten TestPread run time with smaller retry 
timeout setting. Contributed by Liang Xie. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608846)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java


 shorten TestPread run time with a smaller retry timeout setting
 ---

 Key: HDFS-6614
 URL: https://issues.apache.org/jira/browse/HDFS-6614
 Project: Hadoop HDFS
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0, 2.5.0
Reporter: Liang Xie
Assignee: Liang Xie
Priority: Minor
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-6614-addmium.txt, HDFS-6614.txt


 Just noticed logs like this from TestPread:
 DFS chooseDataNode: got # 3 IOException, will wait for 9909.622860072854 msec
 so I tried to set a smaller retry window value.
 Before patch:
  T E S T S
 ---
 Running org.apache.hadoop.hdfs.TestPread
 Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.812 sec - 
 in org.apache.hadoop.hdfs.TestPread
 After the change:
  T E S T S
 ---
 Running org.apache.hadoop.hdfs.TestPread
 Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 131.724 sec - 
 in org.apache.hadoop.hdfs.TestPread



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6627) Rename DataNode#checkWriteAccess to checkReadAccess.

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056123#comment-14056123
 ] 

Hudson commented on HDFS-6627:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1799 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1799/])
HDFS-6627. Rename DataNode#checkWriteAccess to checkReadAccess. Contributed by 
Liang Xie. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608940)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java


 Rename DataNode#checkWriteAccess to checkReadAccess.
 

 Key: HDFS-6627
 URL: https://issues.apache.org/jira/browse/HDFS-6627
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0, 2.5.0
Reporter: Liang Xie
Assignee: Liang Xie
 Fix For: 3.0.0, 2.6.0

 Attachments: HDFS-6627.txt


 Just read the getReplicaVisibleLength() code and found that 
 DataNode.checkWriteAccess is only invoked by 
 DataNode.getReplicaVisibleLength(). Let's rename it to checkReadAccess to 
 avoid confusion, since the real implementation here checks AccessMode.READ:
 {code}
 blockPoolTokenSecretManager.checkAccess(id, null, block,
 BlockTokenSecretManager.AccessMode.READ);
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-3810) Implement format() for BKJM

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056120#comment-14056120
 ] 

Hudson commented on HDFS-3810:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1799 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1799/])
HDFS-4221. Remove the format limitation point from BKJM documentation as 
HDFS-3810 closed. Contributed by Rakesh R. (umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608776)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithNFS.apt.vm


 Implement format() for BKJM
 ---

 Key: HDFS-3810
 URL: https://issues.apache.org/jira/browse/HDFS-3810
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 3.0.0
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Fix For: 3.0.0, 2.0.3-alpha

 Attachments: HDFS-3810.diff, HDFS-3810.diff, HDFS-3810.diff


 At the moment, formatting for BKJM is done on initialization. Reinitializing 
 is a manual process. This JIRA is to implement the JournalManager#format API, 
 so that BKJM can be formatted along with all other storage methods.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4286) Changes from BOOKKEEPER-203 broken capability of including bookkeeper-server jar in hidden package of BKJM

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056125#comment-14056125
 ] 

Hudson commented on HDFS-4286:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1799 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1799/])
HDFS-4286. Changes from BOOKKEEPER-203 broken capability of including 
bookkeeper-server jar in hidden package of BKJM. Contributed by Rakesh R. 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608764)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/pom.xml
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithNFS.apt.vm


 Changes from BOOKKEEPER-203 broken capability of including bookkeeper-server 
 jar in hidden package of BKJM
 --

 Key: HDFS-4286
 URL: https://issues.apache.org/jira/browse/HDFS-4286
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Vinayakumar B
Assignee: Rakesh R
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-4286.patch, HDFS-4286.patch


 BOOKKEEPER-203 introduced changes to LedgerLayout to include 
 ManagerFactoryClass instead of ManagerFactoryName.
 Because of this, BKJM cannot shade the bookkeeper-server jar inside the BKJM 
 jar: the LAYOUT znode created by BookieServer is not readable by BKJM, as it 
 has classes in hidden packages (and the same problem applies vice versa).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6638) shorten test run time with a smaller retry timeout setting

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056121#comment-14056121
 ] 

Hudson commented on HDFS-6638:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1799 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1799/])
HDFS-6638. Shorten test run time with a smaller retry timeout setting. 
Contributed by Liang Xie. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608905)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockMissingException.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocalLegacy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientReportBadBlock.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestCrcCorruption.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptedTransfer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMissingBlocksAlert.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFS.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailoverWithBlockTokensEnabled.java


 shorten test run time with a smaller retry timeout setting
 --

 Key: HDFS-6638
 URL: https://issues.apache.org/jira/browse/HDFS-6638
 Project: Hadoop HDFS
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0, 2.5.0
Reporter: Liang Xie
Assignee: Liang Xie
 Fix For: 3.0.0, 2.6.0

 Attachments: HDFS-6638.txt


 Similar to HDFS-6614, I think this is a general test-duration optimization 
 tip, so I grepped for "IOException, will wait for" in a full test run under 
 the hdfs project, found several test cases that could be optimized, and made 
 a simple patch.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5411) Update Bookkeeper dependency to 4.2.3

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056127#comment-14056127
 ] 

Hudson commented on HDFS-5411:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1799 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1799/])
HDFS-5411. Update Bookkeeper dependency to 4.2.3. Contributed by Rakesh R. 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608781)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/BKJMUtil.java
* /hadoop/common/trunk/hadoop-project/pom.xml


 Update Bookkeeper dependency to 4.2.3
 -

 Key: HDFS-5411
 URL: https://issues.apache.org/jira/browse/HDFS-5411
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 2.2.0
Reporter: Robert Rati
Assignee: Rakesh R
Priority: Minor
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-5411.patch, HDFS-5411.patch


 Update the bookkeeper dependency to 4.2.3. This eases compilation on Fedora 
 platforms.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4221) Remove the format limitation point from BKJM documentation as HDFS-3810 closed

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056126#comment-14056126
 ] 

Hudson commented on HDFS-4221:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1799 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1799/])
HDFS-4221. Remove the format limitation point from BKJM documentation as 
HDFS-3810 closed. Contributed by Rakesh R. (umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608776)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithNFS.apt.vm


 Remove the format limitation point from BKJM documentation as HDFS-3810 closed
 --

 Key: HDFS-4221
 URL: https://issues.apache.org/jira/browse/HDFS-4221
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Uma Maheswara Rao G
Assignee: Rakesh R
 Fix For: 3.0.0, 2.5.0

 Attachments: HDFS-4221.patch


 Remove the format limitation point from BKJM documentation as HDFS-3810 closed



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6646) [ HDFS Rolling Upgrade - Shell ] shutdownDatanode and getDatanodeInfo usage is missed

2014-07-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056128#comment-14056128
 ] 

Hudson commented on HDFS-6646:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1799 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1799/])
HDFS-6646. [ HDFS Rolling Upgrade - Shell ] shutdownDatanode and 
getDatanodeInfo usage is missed ( Contributed by Brahma Reddy Battula) 
(vinayakumarb: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1609020)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java


 [ HDFS Rolling Upgrade - Shell  ] shutdownDatanode and getDatanodeInfo usage 
 is missed
 --

 Key: HDFS-6646
 URL: https://issues.apache.org/jira/browse/HDFS-6646
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: tools
Affects Versions: 2.4.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Fix For: 2.6.0

 Attachments: HDFS-6646.patch, HDFS-6646_1.patch


 Usage message is missing for shutdownDatanode and getDatanodeInfo.
 Please check the following for the same (it prints the whole usage for dfsadmin):
 hdfs dfsadmin -shutdownDatanode
 Usage: java DFSAdmin
 Note: Administrative commands can only be run as the HDFS superuser.
[-report]
[-safemode enter | leave | get | wait]
[-allowSnapshot <snapshotDir>]
[-disallowSnapshot <snapshotDir>]
[-saveNamespace]
[-rollEdits]
[-restoreFailedStorage true|false|check]
[-refreshNodes]
[-finalizeUpgrade]
[-rollingUpgrade [<query|prepare|finalize>]]
[-metasave filename]
[-refreshServiceAcl]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-refreshCallQueue]
[-printTopology]
[-refreshNamenodes datanodehost:port]
[-deleteBlockPool datanode-host:port blockpoolId [force]]
[-setQuota <quota> <dirname>...<dirname>]
[-clrQuota <dirname>...<dirname>]
[-setSpaceQuota <quota> <dirname>...<dirname>]
[-clrSpaceQuota <dirname>...<dirname>]
[-setBalancerBandwidth <bandwidth in bytes per second>]
[-fetchImage <local directory>]
[-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
[-getDatanodeInfo <datanode_host:ipc_port>]
[-help [cmd]]
 Generic options supported are
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
 -libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
 -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
 The general command line syntax is
 bin/hadoop command [genericOptions] [commandOptions]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6648) Order of namenodes in ConfiguredFailoverProxyProvider is not defined by order in hdfs-site.xml

2014-07-09 Thread Rafal Wojdyla (JIRA)
Rafal Wojdyla created HDFS-6648:
---

 Summary: Order of namenodes in ConfiguredFailoverProxyProvider is 
not defined by order in hdfs-site.xml
 Key: HDFS-6648
 URL: https://issues.apache.org/jira/browse/HDFS-6648
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha, hdfs-client
Affects Versions: 2.2.0
Reporter: Rafal Wojdyla


In org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider, 
in the constructor, there's a map nameservice -> (service-id -> 
service-rpc-address) (DFSUtil.getHaNnRpcAddresses). It's a LinkedHashMap of 
HashMaps, so the order is kept for _nameservices_. Then, to find the active 
namenode for a nameservice, we get the HashMap of service-id -> 
service-rpc-address for the requested nameservice (taken from the URI 
request), and for this HashMap we get the values - the order of this 
collection is not strictly defined! In the code: 

{code}
Collection<InetSocketAddress> addressesOfNns = addressesInNN.values(); 
{code}

And then we put these values (in undefined order) into an ArrayList of 
proxies, and then in getProxy we start from the first proxy in the list and 
fail over to the next if needed. 

It would make sense for ConfiguredFailoverProxyProvider to keep the order of 
proxies/namenodes as defined in hdfs-site.xml, as in the sketch below.
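
For illustration, a minimal self-contained sketch (toy string maps, not the 
actual provider code) of the difference: HashMap.values() carries no ordering 
contract, while a LinkedHashMap returns values in insertion order, which here 
would be the order from hdfs-site.xml.

{code}
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class NnOrderDemo {
  public static void main(String[] args) {
    Map<String, String> unordered = new HashMap<>();
    Map<String, String> ordered = new LinkedHashMap<>();
    for (Map<String, String> m : Arrays.asList(unordered, ordered)) {
      m.put("nn1", "host1:8020"); // listed first in hdfs-site.xml
      m.put("nn2", "host2:8020"); // listed second
    }
    Collection<String> undefinedOrder = unordered.values(); // no ordering contract
    Collection<String> configOrder = ordered.values();      // insertion order kept
    System.out.println(undefinedOrder + " vs " + configOrder);
  }
}
{code}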



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-4265) BKJM doesn't take advantage of speculative reads

2014-07-09 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-4265:
---

Attachment: 003-HDFS-4265.patch

 BKJM doesn't take advantage of speculative reads
 

 Key: HDFS-4265
 URL: https://issues.apache.org/jira/browse/HDFS-4265
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: 2.2.0
Reporter: Ivan Kelly
Assignee: Rakesh R
 Attachments: 001-HDFS-4265.patch, 002-HDFS-4265.patch, 
 003-HDFS-4265.patch


 BookKeeperEditLogInputStream reads entry at a time, so it doesn't take 
 advantage of the speculative read mechanism introduced by BOOKKEEPER-336.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6631) TestPread#testHedgedReadLoopTooManyTimes fails intermittently.

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056270#comment-14056270
 ] 

Hadoop QA commented on HDFS-6631:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12654772/HDFS-6631.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7301//console

This message is automatically generated.

 TestPread#testHedgedReadLoopTooManyTimes fails intermittently.
 --

 Key: HDFS-6631
 URL: https://issues.apache.org/jira/browse/HDFS-6631
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, test
Affects Versions: 3.0.0, 2.5.0
Reporter: Chris Nauroth
Assignee: Liang Xie
 Attachments: HDFS-6631.txt, 
 org.apache.hadoop.hdfs.TestPread-output.txt


 {{TestPread#testHedgedReadLoopTooManyTimes}} fails intermittently.  It looks 
 like a race condition on counting the expected number of loop iterations.  I 
 can repro the test failure more consistently on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


webhdfs kerberos not working with multiple users

2014-07-09 Thread Valluri, Sathish
Hi,

 

We are facing an issue with multiple credentials present in the Kerberos
credential cache: when other users try to connect, curl fails, expecting
only the user from the primary cache.

We have 2 different principals, each attached to the same realm. When
trying to connect using curl, it always loads the primary cache, does not
search for other credentials in the cache, and fails.

 

klist -A output snippet showing the 2 different credentials: 

 

Ticket cache: DIR::/etc/netwitness/wc_cache_dir/tktSQ8abu

Default principal: gpad...@example.com

 

Valid starting       Expires              Service principal

07/09/14 18:31:12  07/10/14 18:22:55  krbtgt/example@example.com

renew until 07/09/14 18:31:12

 

Ticket cache: DIR::/etc/netwitness/wc_cache_dir/tktEJgnPE

Default principal: hdfs/pivhdsne.krb...@example.com

 

Valid starting       Expires              Service principal

07/09/14 18:30:54  07/10/14 18:22:38  krbtgt/example@example.com

renew until 07/09/14 18:30:54

 

Here our cache has 2 users, gpadmin and hdfs. When the user tries to connect
as gpadmin, curl works fine, and when the user switches to hdfs, curl fails
with an error. Is there any way to provide the username parameter in the curl
negotiate? Even though we are providing the user in -u hdfs:, curl is not
considering that user and authentication fails.

 

curl -i --negotiate -u hdfs: 
"http://10.31.251.254:50070/webhdfs/v1/?user.name=hdfs&op=LISTSTATUS"

HTTP/1.1 401 

Date: Wed, 09 Jul 2014 13:19:56 GMT

Pragma: no-cache

Date: Wed, 09 Jul 2014 13:19:56 GMT

Pragma: no-cache

WWW-Authenticate: Negotiate

Set-Cookie: hadoop.auth=;Path=/;Expires=Thu, 01-Jan-1970 00:00:00 GMT

Content-Type: text/html;charset=ISO-8859-1

Cache-Control: must-revalidate,no-cache,no-store

Content-Length: 1358

Server: Jetty(7.6.10.v20130312)

 

HTTP/1.1 401 Unauthorized

Date: Wed, 09 Jul 2014 13:19:56 GMT

Pragma: no-cache

Cache-Control: no-cache

Date: Wed, 09 Jul 2014 13:19:56 GMT

Pragma: no-cache

Set-Cookie: hadoop.auth="u=gpadmin&p=gpad...@example.com&t=kerberos&e=1404947996223&s=KfBg3KDnhd5dxYvHMUYmDPqdEy4=";Path=/

Expires: Thu, 01 Jan 1970 00:00:00 GMT

Content-Type: application/json

Transfer-Encoding: chunked

Server: Jetty(7.6.10.v20130312)

 

{"RemoteException":{"exception":"SecurityException","javaClassName":"java.lang.SecurityException","message":"Failed to obtain user group information:
java.io.IOException: Usernames not matched: name=hdfs != expected=gpadmin"}}

 

Can anyone suggest how to make the curl library scan the Kerberos directory
cache and load the proper principal for the particular user?

Are there any options required on the webhdfs front to support multiple
users with Kerberos?

 

Regards

Sathish Valluri



smime.p7s
Description: S/MIME cryptographic signature


[jira] [Commented] (HDFS-4265) BKJM doesn't take advantage of speculative reads

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056290#comment-14056290
 ] 

Hadoop QA commented on HDFS-4265:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12654804/003-HDFS-4265.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7302//console

This message is automatically generated.

 BKJM doesn't take advantage of speculative reads
 

 Key: HDFS-4265
 URL: https://issues.apache.org/jira/browse/HDFS-4265
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: 2.2.0
Reporter: Ivan Kelly
Assignee: Rakesh R
 Attachments: 001-HDFS-4265.patch, 002-HDFS-4265.patch, 
 003-HDFS-4265.patch


 BookKeeperEditLogInputStream reads entry at a time, so it doesn't take 
 advantage of the speculative read mechanism introduced by BOOKKEEPER-336.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-4266) BKJM: Separate write and ack quorum

2014-07-09 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-4266:
---

Attachment: 002-HDFS-4266.patch

 BKJM: Separate write and ack quorum
 ---

 Key: HDFS-4266
 URL: https://issues.apache.org/jira/browse/HDFS-4266
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Reporter: Ivan Kelly
Assignee: Rakesh R
 Attachments: 001-HDFS-4266.patch, 002-HDFS-4266.patch


 BOOKKEEPER-208 allows the ack and write quorums to be different sizes to 
 allow writes to be unaffected by any bookie failure. BKJM should be able to 
 take advantage of this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-4266) BKJM: Separate write and ack quorum

2014-07-09 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-4266:
---

Status: Patch Available  (was: Open)

 BKJM: Separate write and ack quorum
 ---

 Key: HDFS-4266
 URL: https://issues.apache.org/jira/browse/HDFS-4266
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Reporter: Ivan Kelly
Assignee: Rakesh R
 Attachments: 001-HDFS-4266.patch, 002-HDFS-4266.patch


 BOOKKEEPER-208 allows the ack and write quorums to be different sizes to 
 allow writes to be unaffected by any bookie failure. BKJM should be able to 
 take advantage of this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4266) BKJM: Separate write and ack quorum

2014-07-09 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056313#comment-14056313
 ] 

Rakesh R commented on HDFS-4266:


Thanks [~ikelly] for the review.

bq.The patch makes an ack quorum mandatory. This breaks existing configs. The 
ack quorum should default to the write quorum, if the configuration is missing.
The HDFS configuration will return 'quorumSize' if the configuration key is missing:
{code}
ackQuorumSize = conf.getInt(BKJM_BOOKKEEPER_ACK_QUORUM_SIZE, quorumSize);
{code}

I've included a test case to verify this behavior. Could you have a look at 
the latest patch? Thanks!
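
For reference, a minimal sketch of the fallback behavior being described; the 
key names below are illustrative stand-ins, not the exact BKJM constants.

{code}
import org.apache.hadoop.conf.Configuration;

public class AckQuorumDefaultDemo {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    int quorumSize = conf.getInt("bkjm.bookkeeper.quorum.size", 2);
    // If the ack-quorum key is absent, getInt returns the supplied default,
    // i.e. the write quorum -- so existing configs keep working unchanged.
    int ackQuorumSize = conf.getInt("bkjm.bookkeeper.ack.quorum.size", quorumSize);
    System.out.println(quorumSize + " / " + ackQuorumSize);
  }
}
{code}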

 BKJM: Separate write and ack quorum
 ---

 Key: HDFS-4266
 URL: https://issues.apache.org/jira/browse/HDFS-4266
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Reporter: Ivan Kelly
Assignee: Rakesh R
 Attachments: 001-HDFS-4266.patch, 002-HDFS-4266.patch


 BOOKKEEPER-208 allows the ack and write quorums to be different sizes to 
 allow writes to be unaffected by any bookie failure. BKJM should be able to 
 take advantage of this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4266) BKJM: Separate write and ack quorum

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056326#comment-14056326
 ] 

Hadoop QA commented on HDFS-4266:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12654811/002-HDFS-4266.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7303//console

This message is automatically generated.

 BKJM: Separate write and ack quorum
 ---

 Key: HDFS-4266
 URL: https://issues.apache.org/jira/browse/HDFS-4266
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Reporter: Ivan Kelly
Assignee: Rakesh R
 Attachments: 001-HDFS-4266.patch, 002-HDFS-4266.patch


 BOOKKEEPER-208 allows the ack and write quorums to be different sizes to 
 allow writes to be unaffected by any bookie failure. BKJM should be able to 
 take advantage of this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6645) Add test for successive Snapshots between XAttr modifications

2014-07-09 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6645:
--

Status: Patch Available  (was: Open)

 Add test for successive Snapshots between XAttr modifications
 -

 Key: HDFS-6645
 URL: https://issues.apache.org/jira/browse/HDFS-6645
 Project: Hadoop HDFS
  Issue Type: Test
  Components: snapshots, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6645.001.patch


 In the current TestXAttrWithSnapshot unit tests, we create a single snapshot 
 per test.
 We should test taking multiple snapshots on a path in between XAttr 
 modifications of that path. We should also verify that deletion of a snapshot 
 does not somehow alter the XAttrs of the other snapshots of the same path.
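
For context, a rough self-contained sketch (against the generic FileSystem 
API, assuming snapshots are already allowed on the directory) of the kind of 
sequence such a test would exercise:

{code}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Rough sketch of the proposed test sequence; not the attached patch.
// Assumes 'fs' points at a running cluster and snapshots are allowed on dir.
public class SnapshotXAttrSketch {
  static void exercise(FileSystem fs, Path dir) throws Exception {
    fs.setXAttr(dir, "user.a", new byte[]{'1'});
    fs.createSnapshot(dir, "s1");                 // snapshot before the change
    fs.setXAttr(dir, "user.a", new byte[]{'2'});
    fs.createSnapshot(dir, "s2");                 // snapshot after the change
    // Each snapshot should pin the xattr value it saw at creation time.
    byte[] inS1 = fs.getXAttr(new Path(dir, ".snapshot/s1"), "user.a");
    byte[] inS2 = fs.getXAttr(new Path(dir, ".snapshot/s2"), "user.a");
    fs.deleteSnapshot(dir, "s1");
    // Deleting s1 must not change what s2 (or the live dir) reports.
    byte[] stillInS2 = fs.getXAttr(new Path(dir, ".snapshot/s2"), "user.a");
    System.out.println(inS1[0] + " " + inS2[0] + " " + stillInS2[0]);
  }
}
{code}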



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6645) Add test for successive Snapshots between XAttr modifications

2014-07-09 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6645:
--

Status: Open  (was: Patch Available)

Thank you, [~ajisakaa] and [~jingzhao]! I am going to cancel and re-submit 
the patch. The Hadoop QA jenkins job didn't run properly because of the svn 
upgrade issue.

 Add test for successive Snapshots between XAttr modifications
 -

 Key: HDFS-6645
 URL: https://issues.apache.org/jira/browse/HDFS-6645
 Project: Hadoop HDFS
  Issue Type: Test
  Components: snapshots, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6645.001.patch


 In the current TestXAttrWithSnapshot unit tests, we create a single snapshot 
 per test.
 We should test taking multiple snapshots on a path in between XAttr 
 modifications of that path. We should also verify that deletion of a snapshot 
 does not somehow alter the XAttrs of the other snapshots of the same path.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6645) Add test for successive Snapshots between XAttr modifications

2014-07-09 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6645:
--

Attachment: HDFS-6645.001.patch

Reattaching the same patch to trigger the Hadoop QA jenkins job.

 Add test for successive Snapshots between XAttr modifications
 -

 Key: HDFS-6645
 URL: https://issues.apache.org/jira/browse/HDFS-6645
 Project: Hadoop HDFS
  Issue Type: Test
  Components: snapshots, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6645.001.patch, HDFS-6645.001.patch


 In the current TestXAttrWithSnapshot unit tests, we create a single snapshot 
 per test.
 We should test taking multiple snapshots on a path in between XAttr 
 modifications of that path. We should also verify that deletion of a snapshot 
 does not somehow alter the XAttrs of the other snapshots of the same path.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6645) Add test for successive Snapshots between XAttr modifications

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056362#comment-14056362
 ] 

Hadoop QA commented on HDFS-6645:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12654816/HDFS-6645.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7304//console

This message is automatically generated.

 Add test for successive Snapshots between XAttr modifications
 -

 Key: HDFS-6645
 URL: https://issues.apache.org/jira/browse/HDFS-6645
 Project: Hadoop HDFS
  Issue Type: Test
  Components: snapshots, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6645.001.patch, HDFS-6645.001.patch


 In the current TestXAttrWithSnapshot unit tests, we create a single snapshot 
 per test.
 We should test taking multiple snapshots on a path in between XAttr 
 modifications of that path. We should also verify that deletion of a snapshot 
 does not somehow alter the XAttrs of the other snapshots of the same path.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6649) Documentation for setrep is wrong

2014-07-09 Thread Alexander Fahlke (JIRA)
Alexander Fahlke created HDFS-6649:
--

 Summary: Documentation for setrep is wrong
 Key: HDFS-6649
 URL: https://issues.apache.org/jira/browse/HDFS-6649
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 1.0.4
Reporter: Alexander Fahlke
Priority: Trivial


The documentation in: 
http://hadoop.apache.org/docs/r1.0.4/file_system_shell.html#setrep states that 
one must use the command as follows:

- {{Usage: hdfs dfs -setrep [-R] <path>}}
- {{Example: hdfs dfs -setrep -w 3 -R /user/hadoop/dir1}}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6649) Documentation for setrep is wrong

2014-07-09 Thread Alexander Fahlke (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Fahlke updated HDFS-6649:
---

Description: 
The documentation in: 
http://hadoop.apache.org/docs/r1.0.4/file_system_shell.html#setrep states that 
one must use the command as follows:

- {{Usage: hdfs dfs -setrep [-R] <path>}}
- {{Example: hdfs dfs -setrep -w 3 -R /user/hadoop/dir1}}

Correct would be to state that setrep needs the replication factor, and that 
the replication factor must come right before the DFS path. It must look 
like this:

- {{Usage: hdfs dfs -setrep [-R] [-w] <rep> <path/file>}}
- {{Example: hdfs dfs -setrep -w -R 3 /user/hadoop/dir1}}

  was:
The documentation in: 
http://hadoop.apache.org/docs/r1.0.4/file_system_shell.html#setrep states that 
one must use the command as follows:

- {{Usage: hdfs dfs -setrep [-R] <path>}}
- {{Example: hdfs dfs -setrep -w 3 -R /user/hadoop/dir1}}


 Documentation for setrep is wrong
 -

 Key: HDFS-6649
 URL: https://issues.apache.org/jira/browse/HDFS-6649
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 1.0.4
Reporter: Alexander Fahlke
Priority: Trivial

 The documentation in: 
 http://hadoop.apache.org/docs/r1.0.4/file_system_shell.html#setrep states 
 that one must use the command as follows:
 - {{Usage: hdfs dfs -setrep [-R] <path>}}
 - {{Example: hdfs dfs -setrep -w 3 -R /user/hadoop/dir1}}
 Correct would be to state that setrep needs the replication factor, and that 
 the replication factor must come right before the DFS path. It must look 
 like this:
 - {{Usage: hdfs dfs -setrep [-R] [-w] <rep> <path/file>}}
 - {{Example: hdfs dfs -setrep -w -R 3 /user/hadoop/dir1}}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6645) Add test for successive Snapshots between XAttr modifications

2014-07-09 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056372#comment-14056372
 ] 

Stephen Chu commented on HDFS-6645:
---

I think the above is not a problem with the patch, but a Hadoop QA/jenkins 
issue that is hitting other PreCommit-HDFS-Builds. Will look into it.

 Add test for successive Snapshots between XAttr modifications
 -

 Key: HDFS-6645
 URL: https://issues.apache.org/jira/browse/HDFS-6645
 Project: Hadoop HDFS
  Issue Type: Test
  Components: snapshots, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6645.001.patch, HDFS-6645.001.patch


 In the current TestXAttrWithSnapshot unit tests, we create a single snapshot 
 per test.
 We should test taking multiple snapshots on a path in between XAttr 
 modifications of that path. We should also verify that deletion of a snapshot 
 does not somehow alter the XAttrs of the other snapshots of the same path.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6622) Rename and AddBlock may race and produce invalid edits

2014-07-09 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6622:
-

Attachment: HDFS-6622.v2.patch

 Rename and AddBlock may race and produce invalid edits
 --

 Key: HDFS-6622
 URL: https://issues.apache.org/jira/browse/HDFS-6622
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6622.patch, HDFS-6622.v2.patch


 While investigating HDFS-6618, we have discovered that rename happening in 
 the middle of {{getAdditionalBlock()}} can lead to logging of invalid edit 
 entry.
 In  {{getAdditionalBlock()}} , the path is resolved once while holding the 
 read lock and the same resolved path will be used in the edit log in the 
 second half of the method holding the write lock.  If a rename happens in 
 between two locks, the path may no longer exist. 
 When replaying the {{AddBlockOp}}, it will fail with FileNotFound.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HDFS-6622) Rename and AddBlock may race and produce invalid edits

2014-07-09 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee reassigned HDFS-6622:


Assignee: Kihwal Lee

 Rename and AddBlock may race and produce invalid edits
 --

 Key: HDFS-6622
 URL: https://issues.apache.org/jira/browse/HDFS-6622
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6622.patch, HDFS-6622.v2.patch


 While investigating HDFS-6618, we have discovered that rename happening in 
 the middle of {{getAdditionalBlock()}} can lead to logging of invalid edit 
 entry.
 In  {{getAdditionalBlock()}} , the path is resolved once while holding the 
 read lock and the same resolved path will be used in the edit log in the 
 second half of the method holding the write lock.  If a rename happens in 
 between two locks, the path may no longer exist. 
 When replaying the {{AddBlockOp}}, it will fail with FileNotFound.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6618) Remove deleted INodes from INodeMap right away

2014-07-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056451#comment-14056451
 ] 

Kihwal Lee commented on HDFS-6618:
--

Ok, I will take the simple approach.

 Remove deleted INodes from INodeMap right away
 --

 Key: HDFS-6618
 URL: https://issues.apache.org/jira/browse/HDFS-6618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6618.AbstractList.patch, 
 HDFS-6618.inodeRemover.patch, HDFS-6618.inodeRemover.v2.patch, HDFS-6618.patch


 After HDFS-6527, we have not seen the edit log corruption for weeks on 
 multiple clusters until yesterday. Previously, we would see it within 30 
 minutes on a cluster.
 But the same condition was reproduced even with HDFS-6527.  The only 
 explanation is that the RPC handler thread serving {{addBlock()}} was 
 accessing stale parent value.  Although nulling out parent is done inside the 
 {{FSNamesystem}} and {{FSDirectory}} write lock, there is no memory barrier 
 because there is no synchronized block involved in the process.
 I suggest making parent volatile.
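
A toy, self-contained sketch of the visibility argument (plain Java, no 
NameNode types): volatile supplies the happens-before edge that is otherwise 
missing for a reader that takes no lock.

{code}
public class ParentVisibility {
  static class Node {
    volatile Node parent; // the proposed fix: volatile supplies the barrier
  }

  public static void main(String[] args) {
    final Node child = new Node();
    child.parent = new Node();
    new Thread(() -> child.parent = null).start(); // "deleter" thread
    // Reader thread: once the write above happens, a volatile read here is
    // guaranteed to see null; with a plain field there is no such guarantee.
    new Thread(() -> System.out.println(child.parent)).start();
  }
}
{code}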



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4265) BKJM doesn't take advantage of speculative reads

2014-07-09 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056454#comment-14056454
 ] 

Rakesh R commented on HDFS-4265:


Attached a new patch addressing [~ikelly]'s comments. Please review the 
patch. Thanks!

It looks like the jenkins report is not correct. 

 BKJM doesn't take advantage of speculative reads
 

 Key: HDFS-4265
 URL: https://issues.apache.org/jira/browse/HDFS-4265
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: 2.2.0
Reporter: Ivan Kelly
Assignee: Rakesh R
 Attachments: 001-HDFS-4265.patch, 002-HDFS-4265.patch, 
 003-HDFS-4265.patch


 BookKeeperEditLogInputStream reads entry at a time, so it doesn't take 
 advantage of the speculative read mechanism introduced by BOOKKEEPER-336.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6622) Rename and AddBlock may race and produce invalid edits

2014-07-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056449#comment-14056449
 ] 

Kihwal Lee commented on HDFS-6622:
--

The reason I added the strict check was to prevent incorrect operations based 
on a potentially incorrect result from getFullPathName(). If the inode's 
parent is not null (stale), but one of the ancestors' parents is null, it 
will assume the inode is directly under /. This could happen with the delayed 
inode removal. But since we are going to remove inodes from the inodeMap 
while holding the FSNamesystem write lock, this should not happen. So what 
you suggest will be sufficient.

I also wanted to reduce the number of times getFullPathName() is called. I 
will simply remove the comparison and fix the test to check the correctness 
of the edit.
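
A self-contained sketch of the pattern being agreed on here, with a toy 
read/write lock and path in place of the FSNamesystem internals: a value 
derived from mutable namespace state is recomputed after reacquiring the 
lock, never carried across the gap between the read and write lock.

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy sketch (not the HDFS-6622 patch itself).
public class RecomputeUnderLock {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private String path = "/user/hadoop/file"; // may be renamed concurrently

  String resolveUnderReadLock() {
    lock.readLock().lock();
    try {
      return path; // first half: resolved under the read lock
    } finally {
      lock.readLock().unlock();
    }
  }

  void logEditUnderWriteLock() {
    // A rename may run here, between the two lock acquisitions.
    lock.writeLock().lock();
    try {
      String current = path; // recompute; do not trust the earlier value
      System.out.println("logging AddBlockOp for " + current);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}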


 Rename and AddBlock may race and produce invalid edits
 --

 Key: HDFS-6622
 URL: https://issues.apache.org/jira/browse/HDFS-6622
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6622.patch, HDFS-6622.v2.patch


 While investigating HDFS-6618, we have discovered that rename happening in 
 the middle of {{getAdditionalBlock()}} can lead to logging of invalid edit 
 entry.
 In  {{getAdditionalBlock()}} , the path is resolved once while holding the 
 read lock and the same resolved path will be used in the edit log in the 
 second half of the method holding the write lock.  If a rename happens in 
 between two locks, the path may no longer exist. 
 When replaying the {{AddBlockOp}}, it will fail with FileNotFound.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6622) Rename and AddBlock may race and produce invalid edits

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056469#comment-14056469
 ] 

Hadoop QA commented on HDFS-6622:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12654832/HDFS-6622.v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7305//console

This message is automatically generated.

 Rename and AddBlock may race and produce invalid edits
 --

 Key: HDFS-6622
 URL: https://issues.apache.org/jira/browse/HDFS-6622
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6622.patch, HDFS-6622.v2.patch


 While investigating HDFS-6618, we have discovered that rename happening in 
 the middle of {{getAdditionalBlock()}} can lead to logging of invalid edit 
 entry.
 In  {{getAdditionalBlock()}} , the path is resolved once while holding the 
 read lock and the same resolved path will be used in the edit log in the 
 second half of the method holding the write lock.  If a rename happens in 
 between two locks, the path may no longer exist. 
 When replaying the {{AddBlockOp}}, it will fail with FileNotFound.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6618) Remove deleted INodes from INodeMap right away

2014-07-09 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6618:
-

Attachment: HDFS-6618.simpler.patch

 Remove deleted INodes from INodeMap right away
 --

 Key: HDFS-6618
 URL: https://issues.apache.org/jira/browse/HDFS-6618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6618.AbstractList.patch, 
 HDFS-6618.inodeRemover.patch, HDFS-6618.inodeRemover.v2.patch, 
 HDFS-6618.patch, HDFS-6618.simpler.patch


 After HDFS-6527, we have not seen the edit log corruption for weeks on 
 multiple clusters until yesterday. Previously, we would see it within 30 
 minutes on a cluster.
 But the same condition was reproduced even with HDFS-6527.  The only 
 explanation is that the RPC handler thread serving {{addBlock()}} was 
 accessing stale parent value.  Although nulling out parent is done inside the 
 {{FSNamesystem}} and {{FSDirectory}} write lock, there is no memory barrier 
 because there is no synchronized block involved in the process.
 I suggest making parent volatile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6634) inotify in HDFS

2014-07-09 Thread James Thomas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056474#comment-14056474
 ] 

James Thomas commented on HDFS-6634:


That would require this project to be a third-party service, much like the work 
presented in http://www.youtube.com/watch?v=7KumMKqBtr8

So I think the same concerns raised in the Q & A (starting around 25:30) would 
apply here -- in particular, changes to the format of the audit log would cause 
problems for user applications. I think tighter integration with HDFS and 
exposure of the edits in ProtoBuf form is important.

 inotify in HDFS
 ---

 Key: HDFS-6634
 URL: https://issues.apache.org/jira/browse/HDFS-6634
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client, namenode, qjm
Reporter: James Thomas
Assignee: James Thomas
 Attachments: inotify-intro.pdf


 Design a mechanism for applications like search engines to access the HDFS 
 edit stream.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6618) Remove deleted INodes from INodeMap right away

2014-07-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056477#comment-14056477
 ] 

Kihwal Lee commented on HDFS-6618:
--

With the new patch {{removedINodes}} is passed to 
{{FSNamesystem#removePathAndBlocks()}} while in the write lock. The method was 
modified to conditionally acquire the directory lock.  I didn't move the 
removal to {{FSDirectory}}, since we may want to do something with the inodes 
in {{FSNamesystem}} later as a part of failure handling in a separate jira.

 Remove deleted INodes from INodeMap right away
 --

 Key: HDFS-6618
 URL: https://issues.apache.org/jira/browse/HDFS-6618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6618.AbstractList.patch, 
 HDFS-6618.inodeRemover.patch, HDFS-6618.inodeRemover.v2.patch, 
 HDFS-6618.patch, HDFS-6618.simpler.patch


 After HDFS-6527, we have not seen the edit log corruption for weeks on 
 multiple clusters until yesterday. Previously, we would see it within 30 
 minutes on a cluster.
 But the same condition was reproduced even with HDFS-6527.  The only 
 explanation is that the RPC handler thread serving {{addBlock()}} was 
 accessing stale parent value.  Although nulling out parent is done inside the 
 {{FSNamesystem}} and {{FSDirectory}} write lock, there is no memory barrier 
 because there is no synchronized block involved in the process.
 I suggest making parent volatile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6618) Remove deleted INodes from INodeMap right away

2014-07-09 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056481#comment-14056481
 ] 

Jing Zhao commented on HDFS-6618:
-

Thanks [~kihwal]. The patch looks good to me. +1 pending Jenkins.

 Remove deleted INodes from INodeMap right away
 --

 Key: HDFS-6618
 URL: https://issues.apache.org/jira/browse/HDFS-6618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6618.AbstractList.patch, 
 HDFS-6618.inodeRemover.patch, HDFS-6618.inodeRemover.v2.patch, 
 HDFS-6618.patch, HDFS-6618.simpler.patch


 After HDFS-6527, we have not seen the edit log corruption for weeks on 
 multiple clusters until yesterday. Previously, we would see it within 30 
 minutes on a cluster.
 But the same condition was reproduced even with HDFS-6527.  The only 
 explanation is that the RPC handler thread serving {{addBlock()}} was 
 accessing stale parent value.  Although nulling out parent is done inside the 
 {{FSNamesystem}} and {{FSDirectory}} write lock, there is no memory barrier 
 because there is no synchronized block involved in the process.
 I suggest making parent volatile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6650) API to get the root of an encryption zone for a path

2014-07-09 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-6650:
-

 Summary: API to get the root of an encryption zone for a path
 Key: HDFS-6650
 URL: https://issues.apache.org/jira/browse/HDFS-6650
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Andrew Wang
Assignee: Andrew Wang


It'd be useful to be able to query, given a path within an encryption zone, the 
root of the encryption zone.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6618) Remove deleted INodes from INodeMap right away

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056486#comment-14056486
 ] 

Hadoop QA commented on HDFS-6618:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12654834/HDFS-6618.simpler.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7306//console

This message is automatically generated.

 Remove deleted INodes from INodeMap right away
 --

 Key: HDFS-6618
 URL: https://issues.apache.org/jira/browse/HDFS-6618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6618.AbstractList.patch, 
 HDFS-6618.inodeRemover.patch, HDFS-6618.inodeRemover.v2.patch, 
 HDFS-6618.patch, HDFS-6618.simpler.patch


 After HDFS-6527, we have not seen the edit log corruption for weeks on 
 multiple clusters until yesterday. Previously, we would see it within 30 
 minutes on a cluster.
 But the same condition was reproduced even with HDFS-6527.  The only 
 explanation is that the RPC handler thread serving {{addBlock()}} was 
 accessing stale parent value.  Although nulling out parent is done inside the 
 {{FSNamesystem}} and {{FSDirectory}} write lock, there is no memory barrier 
 because there is no synchronized block involved in the process.
 I suggest making parent volatile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6651) Deletion failure can leak inodes permanently.

2014-07-09 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-6651:


 Summary: Deletion failure can leak inodes permanently.
 Key: HDFS-6651
 URL: https://issues.apache.org/jira/browse/HDFS-6651
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Priority: Critical


As discussed in HDFS-6618, if a deletion of a tree fails in the middle, any 
collected inodes and blocks will not be removed from {{INodeMap}} and 
{{BlocksMap}}. 

Since the fsimage is saved by iterating over {{INodeMap}}, the leak will 
persist across a namenode restart. Although blanked-out inodes will not have 
references to blocks, the blocks will still refer to the inode as their 
{{BlockCollection}}. As long as that reference is not null, the blocks will 
live on. The leaked blocks from blanked-out inodes will go away after restart.

Options (when delete fails in the middle)
- Complete the partial delete: edit log the partial delete and remove inodes 
and blocks. 
- Somehow undo the partial delete.
- Check quota for snapshot diff beforehand for the whole subtree.
- Ignore quota check during delete even if snapshot is present.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6618) Remove deleted INodes from INodeMap right away

2014-07-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056499#comment-14056499
 ] 

Kihwal Lee commented on HDFS-6618:
--

Filed HDFS-6651 for the leak problem.

 Remove deleted INodes from INodeMap right away
 --

 Key: HDFS-6618
 URL: https://issues.apache.org/jira/browse/HDFS-6618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6618.AbstractList.patch, 
 HDFS-6618.inodeRemover.patch, HDFS-6618.inodeRemover.v2.patch, 
 HDFS-6618.patch, HDFS-6618.simpler.patch


 After HDFS-6527, we have not seen the edit log corruption for weeks on 
 multiple clusters until yesterday. Previously, we would see it within 30 
 minutes on a cluster.
 But the same condition was reproduced even with HDFS-6527.  The only 
 explanation is that the RPC handler thread serving {{addBlock()}} was 
 accessing stale parent value.  Although nulling out parent is done inside the 
 {{FSNamesystem}} and {{FSDirectory}} write lock, there is no memory barrier 
 because there is no synchronized block involved in the process.
 I suggest making parent volatile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6618) Remove deleted INodes from INodeMap right away

2014-07-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056512#comment-14056512
 ] 

Kihwal Lee commented on HDFS-6618:
--

The build failed because of this. 
{panel}
 [exec] CMake Error at 
/usr/share/cmake-2.8/Modules/FindPackageHandleStandardArgs.cmake:108 (message):
 [exec]   Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE)
 [exec] Call Stack (most recent call first):
 [exec]   
/usr/share/cmake-2.8/Modules/FindPackageHandleStandardArgs.cmake:315 
(_FPHSA_FAILURE_MESSAGE)
 [exec]   /usr/share/cmake-2.8/Modules/FindPkgConfig.cmake:106 
(find_package_handle_standard_args)
 [exec]   main/native/fuse-dfs/CMakeLists.txt:23 (find_package)
{panel}



 Remove deleted INodes from INodeMap right away
 --

 Key: HDFS-6618
 URL: https://issues.apache.org/jira/browse/HDFS-6618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6618.AbstractList.patch, 
 HDFS-6618.inodeRemover.patch, HDFS-6618.inodeRemover.v2.patch, 
 HDFS-6618.patch, HDFS-6618.simpler.patch


 After HDFS-6527, we have not seen the edit log corruption for weeks on 
 multiple clusters until yesterday. Previously, we would see it within 30 
 minutes on a cluster.
 But the same condition was reproduced even with HDFS-6527.  The only 
 explanation is that the RPC handler thread serving {{addBlock()}} was 
 accessing stale parent value.  Although nulling out parent is done inside the 
 {{FSNamesystem}} and {{FSDirectory}} write lock, there is no memory barrier 
 because there is no synchronized block involved in the process.
 I suggest making parent volatile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056517#comment-14056517
 ] 

Colin Patrick McCabe commented on HDFS-6647:


The simplest thing is probably just to have {{updatePipeline}} throw an 
exception if the file doesn't exist (or exists only in snapshots).

bq. It shouldn't be difficult to change the FSEditLogLoader to be able to read 
the OP_UPDATE_BLOCKS op if we just change it to look up the INode by block ID.

We could do that when recovery mode is on. I don't think we want to do that 
normally, since snapshotted blocks are not supposed to be mutable.
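
A sketch of the kind of guard being proposed; the helper name 
fileExistsOutsideSnapshots is invented for illustration, and the real change 
would live in the NameNode's updatePipeline path.

{code}
import java.io.FileNotFoundException;

class UpdatePipelineGuard {
  void updatePipelineInternal(String src) throws FileNotFoundException {
    if (!fileExistsOutsideSnapshots(src)) {
      // Refuse to log OP_UPDATE_BLOCKS for a file that was deleted
      // (or lives on only in a snapshot), instead of corrupting the edits.
      throw new FileNotFoundException("File does not exist: " + src);
    }
    // ... proceed with the normal pipeline update ...
  }

  private boolean fileExistsOutsideSnapshots(String src) {
    return true; // placeholder for the real inode lookup
  }
}
{code}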

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056523#comment-14056523
 ] 

Jing Zhao commented on HDFS-6647:
-

In HDFS-6527 we do not allow users to get an additional block if the file has 
been deleted (but can be in a snapshot). Maybe here we should also fail the 
{{updatePipeline}} call to make it consistent?

But in the meanwhile, I think in the future it will be better to weaken the 
dependency between the states of blocks and files, e.g., letting RPC calls like 
{{updatePipeline}} only update and check the state of blocks. This can make 
work like separating block management out as a service (HDFS-5477) easier.

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HDFS-172) Quota exceed exception creates file of size 0

2014-07-09 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze resolved HDFS-172.
--

Resolution: Not a Problem

Resolve this as not-a-problem.  Please feel free to reopen if you disagree.

 Quota exceed exception creates file of size 0
 -

 Key: HDFS-172
 URL: https://issues.apache.org/jira/browse/HDFS-172
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravi Phulari

 An empty file of size 0 is created when a QuotaExceeded exception occurs 
 while copying a file. This file is created with the same name as the file 
 whose copy was attempted.
 I.e., if the operation 
 hadoop fs -copyFromLocal testFile1 /testDir 
 fails due to a quota exceeded exception, then a testFile1 of size 0 is 
 created in testDir on HDFS.
 Steps to verify: 
 1) Create testDir and apply a space quota of 16kb
 2) Copy a file, say testFile, of size greater than 16kb from the local file 
 system
 3) You should see a QuotaException error 
 4) testFile of size 0 is created in testDir, which is not expected.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6622) Rename and AddBlock may race and produce invalid edits

2014-07-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056596#comment-14056596
 ] 

Colin Patrick McCabe commented on HDFS-6622:


bq. I also wanted to reduce the number of times getFullPathName() is called. I 
will simply remove the comparison and fix the test to check the correctness of 
edit.

OK.  From what I can see, the path should be recomputed while under the lock 
(rather than simply trusting that it will stay the same since we last released 
the lock).  That should fix things.  It looks like you introduced the FileState 
object in order to avoid calling getFullPathName() twice while holding the 
lock... fair enough.

+1

 Rename and AddBlock may race and produce invalid edits
 --

 Key: HDFS-6622
 URL: https://issues.apache.org/jira/browse/HDFS-6622
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6622.patch, HDFS-6622.v2.patch


 While investigating HDFS-6618, we have discovered that rename happening in 
 the middle of {{getAdditionalBlock()}} can lead to logging of invalid edit 
 entry.
 In  {{getAdditionalBlock()}} , the path is resolved once while holding the 
 read lock and the same resolved path will be used in the edit log in the 
 second half of the method holding the write lock.  If a rename happens in 
 between two locks, the path may no longer exist. 
 When replaying the {{AddBlockOp}}, it will fail with FileNotFound.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6621) Hadoop Balancer prematurely exits iterations

2014-07-09 Thread Benjamin Bowman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bowman updated HDFS-6621:
--

Status: Patch Available  (was: Open)

 Hadoop Balancer prematurely exits iterations
 

 Key: HDFS-6621
 URL: https://issues.apache.org/jira/browse/HDFS-6621
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.4.0, 2.2.0
 Environment: Red Hat Enterprise Linux Server release 5.8 with Hadoop 
 2.4.0
Reporter: Benjamin Bowman
  Labels: balancer
 Attachments: HDFS-6621.patch


 I have been having an issue with the balancing being too slow.  The issue was 
 not with the speed with which blocks were moved, but rather that the balancer 
 would prematurely exit out of its balancing iterations.  It would move ~10 
 blocks or 100 MB, then exit the current iteration (in which it said it was 
 planning on moving about 10 GB). 
 I looked in the Balancer.java code and believe I found and solved the issue.  
 In the dispatchBlocks() function there is a variable, 
 noPendingBlockIteration, which counts the number of iterations in which a 
 pending block to move cannot be found.  Once this number gets to 5, the 
 balancer exits the overall balancing iteration.  I believe the desired 
 functionality is 5 consecutive no-pending-block iterations - however, this 
 variable is never reset to 0 upon block moves.  So once this number reaches 5 
 - even if there have been thousands of blocks moved in between these 
 no-pending-block iterations - the overall balancing iteration will 
 prematurely end.  
 The fix I applied was to set noPendingBlockIteration = 0 when a pending block 
 is found and scheduled.  In this way, my iterations do not prematurely exit 
 unless there are 5 consecutive no-pending-block iterations.   Below is a copy 
 of my dispatchBlocks() function with the change I made.
 private void dispatchBlocks() {
   long startTime = Time.now();
   long scheduledSize = getScheduledSize();
   this.blocksToReceive = 2 * scheduledSize;
   boolean isTimeUp = false;
   int noPendingBlockIteration = 0;
   while (!isTimeUp && getScheduledSize() > 0 &&
       (!srcBlockList.isEmpty() || blocksToReceive > 0)) {
     PendingBlockMove pendingBlock = chooseNextBlockToMove();
     if (pendingBlock != null) {
       noPendingBlockIteration = 0;
       // move the block
       pendingBlock.scheduleBlockMove();
       continue;
     }
     /* Since we can not schedule any block to move,
      * filter any moved blocks from the source block list and
      * check if we should fetch more blocks from the namenode
      */
     filterMovedBlocks(); // filter already moved blocks
     if (shouldFetchMoreBlocks()) {
       // fetch new blocks
       try {
         blocksToReceive -= getBlockList();
         continue;
       } catch (IOException e) {
         LOG.warn("Exception while getting block list", e);
         return;
       }
     } else {
       // source node cannot find a pendingBlockToMove, iteration +1
       noPendingBlockIteration++;
       // in case no blocks can be moved for source node's task,
       // jump out of while-loop after 5 iterations.
       if (noPendingBlockIteration >= MAX_NO_PENDING_BLOCK_ITERATIONS) {
         setScheduledSize(0);
       }
     }
     // check if time is up or not
     if (Time.now() - startTime > MAX_ITERATION_TIME) {
       isTimeUp = true;
       continue;
     }
     /* Now we can not schedule any block to move and there are
      * no new blocks added to the source block list, so we wait.
      */
     try {
       synchronized (Balancer.this) {
         Balancer.this.wait(1000); // wait for targets/sources to be idle
       }
     } catch (InterruptedException ignored) {
     }
   }
 }



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6618) Remove deleted INodes from INodeMap right away

2014-07-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056608#comment-14056608
 ] 

Colin Patrick McCabe commented on HDFS-6618:


It looks like someone is updating the build slaves, and somehow pkg-config got 
uninstalled?  I will kick the build again.

 Remove deleted INodes from INodeMap right away
 --

 Key: HDFS-6618
 URL: https://issues.apache.org/jira/browse/HDFS-6618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6618.AbstractList.patch, 
 HDFS-6618.inodeRemover.patch, HDFS-6618.inodeRemover.v2.patch, 
 HDFS-6618.patch, HDFS-6618.simpler.patch


 After HDFS-6527, we have not seen the edit log corruption for weeks on 
 multiple clusters until yesterday. Previously, we would see it within 30 
 minutes on a cluster.
 But the same condition was reproduced even with HDFS-6527.  The only 
 explanation is that the RPC handler thread serving {{addBlock()}} was 
 accessing stale parent value.  Although nulling out parent is done inside the 
 {{FSNamesystem}} and {{FSDirectory}} write lock, there is no memory barrier 
 because there is no synchronized block involved in the process.
 I suggest making parent volatile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6618) Remove deleted INodes from INodeMap right away

2014-07-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056610#comment-14056610
 ] 

Colin Patrick McCabe commented on HDFS-6618:


+1 for the patch.

Just one small note... I'd prefer to see lock... unlock blocks around 
removePathAndBlocks when appropriate, rather than a boolean "lock me" passed 
in, but we can address that in the refactoring, I guess.
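
A small self-contained sketch of that style point, with a toy ReentrantLock 
standing in for the directory lock (method names here are illustrative only):

{code}
import java.util.concurrent.locks.ReentrantLock;

class LockStyleDemo {
  private final ReentrantLock dirLock = new ReentrantLock();

  // Discouraged: the boolean flag hides who actually owns the lock.
  void removePathAndBlocks(String path, boolean acquireLock) {
    if (acquireLock) dirLock.lock();
    try {
      // ... mutate directory state ...
    } finally {
      if (acquireLock) dirLock.unlock();
    }
  }

  // Preferred: the caller's lock scope is explicit and easy to audit.
  void caller(String path) {
    dirLock.lock();
    try {
      removePathAndBlocks(path, false);
    } finally {
      dirLock.unlock();
    }
  }
}
{code}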

 Remove deleted INodes from INodeMap right away
 --

 Key: HDFS-6618
 URL: https://issues.apache.org/jira/browse/HDFS-6618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6618.AbstractList.patch, 
 HDFS-6618.inodeRemover.patch, HDFS-6618.inodeRemover.v2.patch, 
 HDFS-6618.patch, HDFS-6618.simpler.patch


 After HDFS-6527, we have not seen the edit log corruption for weeks on 
 multiple clusters until yesterday. Previously, we would see it within 30 
 minutes on a cluster.
 But the same condition was reproduced even with HDFS-6527.  The only 
 explanation is that the RPC handler thread serving {{addBlock()}} was 
 accessing stale parent value.  Although nulling out parent is done inside the 
 {{FSNamesystem}} and {{FSDirectory}} write lock, there is no memory barrier 
 because there is no synchronized block involved in the process.
 I suggest making parent volatile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6469) Coordinated replication of the namespace using ConsensusNode

2014-07-09 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056634#comment-14056634
 ] 

Konstantin Shvachko commented on HDFS-6469:
---

You are right: on the client, hflush() calls the NN's fsync() only once, at the 
beginning of each block, because it does not update the block's length. Thank 
you for the correction, Nicholas.

 Coordinated replication of the namespace using ConsensusNode
 

 Key: HDFS-6469
 URL: https://issues.apache.org/jira/browse/HDFS-6469
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: namenode
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Attachments: CNodeDesign.pdf


 This is a proposal to introduce ConsensusNode - an evolution of the NameNode, 
 which enables replication of the namespace on multiple nodes of an HDFS 
 cluster by means of a Coordination Engine.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6634) inotify in HDFS

2014-07-09 Thread James Thomas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Thomas updated HDFS-6634:
---

Attachment: inotify-intro.2.pdf

Updated the design doc in response to Andrew's comments. I think we can start 
by exposing the entire edits stream to just superusers.

 inotify in HDFS
 ---

 Key: HDFS-6634
 URL: https://issues.apache.org/jira/browse/HDFS-6634
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client, namenode, qjm
Reporter: James Thomas
Assignee: James Thomas
 Attachments: inotify-intro.2.pdf, inotify-intro.pdf


 Design a mechanism for applications like search engines to access the HDFS 
 edit stream.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist

2014-07-09 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056664#comment-14056664
 ] 

Uma Maheswara Rao G commented on HDFS-6422:
---

Thanks for summarizing the pending things.
{quote}
 that we should be validating xattr name access based on scan permission on the 
inode's owning directory and xattr value access based on the inode's permission?
{quote}
For writing XAttrs, the current permission checks cover this. From your comment, 
for listing XAttrs we may need an owner check as well, I think.
{code}
 /* To access xattr names, you need EXECUTE in the owning directory. */
 checkParentAccess(pc, src, FsAction.EXECUTE);
{code}
The current check validates only EXECUTE permission on the parent dir; it does 
not care whether you are the owner of the current directory or not. What do you 
say?
For getXAttrs, it will actually fetch the values, and it has a path access check 
on the inode, so this should be fine. SetXAttrs and RemoveXAttrs are treated as 
writing xattrs, and their permission checks are covered appropriately, as 
documented for the namespace categories. 
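
For clarity, a hedged sketch of the combined check being discussed (the helper 
name is hypothetical; checkParentAccess and checkOwner are the existing 
FSNamesystem-style checks quoted above):

{code}
// Sketch only -- not an actual patch.
private void checkXAttrNameAccess(FSPermissionChecker pc, String src)
    throws IOException {
  /* Existing check: EXECUTE (scan) permission on the owning directory. */
  checkParentAccess(pc, src, FsAction.EXECUTE);
  /* Possible additional check under discussion: ownership of the inode. */
  checkOwner(pc, src);
}
{code}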

 getfattr in CLI doesn't throw exception or return non-0 return code when 
 xattr doesn't exist
 

 Key: HDFS-6422
 URL: https://issues.apache.org/jira/browse/HDFS-6422
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.5.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch


 If you do
 hdfs dfs -getfattr -n user.blah /foo
 and user.blah doesn't exist, the command prints
 # file: /foo
 and a 0 return code.
 It should print an exception and return a non-0 return code instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6634) inotify in HDFS

2014-07-09 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056708#comment-14056708
 ] 

Steve Loughran commented on HDFS-6634:
--

I saw that talk, though it [was this 
one|https://www.youtube.com/watch?v=XZWwwc-qeJo&index=35&list=PLSAiKuajRe2kIxG-WKmTOZNlDpSgYwEBs]
 that I was thinking of.


 inotify in HDFS
 ---

 Key: HDFS-6634
 URL: https://issues.apache.org/jira/browse/HDFS-6634
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client, namenode, qjm
Reporter: James Thomas
Assignee: James Thomas
 Attachments: inotify-intro.2.pdf, inotify-intro.pdf


 Design a mechanism for applications like search engines to access the HDFS 
 edit stream.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HDFS-6650) API to get the root of an encryption zone for a path

2014-07-09 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb resolved HDFS-6650.


Resolution: Duplicate

Duplicate of HDFS-6546.


 API to get the root of an encryption zone for a path
 

 Key: HDFS-6650
 URL: https://issues.apache.org/jira/browse/HDFS-6650
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: security
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Andrew Wang
Assignee: Andrew Wang

 It'd be useful to be able to query, given a path within an encryption zone, 
 the root of the encryption zone.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5202) umbrella JIRA for Windows support in HDFS caching

2014-07-09 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5202:


  Component/s: datanode
 Target Version/s: 3.0.0, 2.6.0  (was: HDFS-4949)
Affects Version/s: 2.5.0
   3.0.0
 Assignee: Chris Nauroth

 umbrella JIRA for Windows support in HDFS caching
 -

 Key: HDFS-5202
 URL: https://issues.apache.org/jira/browse/HDFS-5202
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0, 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Chris Nauroth

 This is an umbrella JIRA for adding Windows support for HDFS caching.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5202) Support Centralized Cache Management on Windows.

2014-07-09 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5202:


Description: HDFS caching currently is implemented using POSIX syscalls for 
checking ulimit and locking pages of memory into the process's address space.  
These POSIX syscalls do not exist on Windows.  This issue will implement 
equivalent functionality so that Windows deployments can use Centralized Cache 
Management.  (was: This is an umbrella JIRA for adding Windows support for HDFS 
caching.)
Summary: Support Centralized Cache Management on Windows.  (was: 
umbrella JIRA for Windows support in HDFS caching)

I've changed the summary and description to remove the word "umbrella".  This 
patch is actually going to be quite small, and the word "umbrella" just seemed 
ominous.  :-)

 Support Centralized Cache Management on Windows.
 

 Key: HDFS-5202
 URL: https://issues.apache.org/jira/browse/HDFS-5202
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0, 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Chris Nauroth

 HDFS caching currently is implemented using POSIX syscalls for checking 
 ulimit and locking pages of memory into the process's address space.  These 
 POSIX syscalls do not exist on Windows.  This issue will implement equivalent 
 functionality so that Windows deployments can use Centralized Cache 
 Management.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist

2014-07-09 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056772#comment-14056772
 ] 

Charles Lamb commented on HDFS-6422:


Thanks [~umamaheswararao].

So I'll add code to throw exceptions as previously specified, and to check the 
owner for listXAttrs.
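
For the intended CLI behavior, a hedged sketch against the public FileSystem 
xattr API (fs, the path, and the xattr name are just examples):

{code}
// Sketch only -- illustrative of the intent, not the patch itself.
Map<String, byte[]> xattrs = fs.getXAttrs(new Path("/foo"),
    Collections.singletonList("user.blah"));
if (xattrs.isEmpty()) {
  // Surface an error and a non-zero exit code instead of printing only
  // "# file: /foo" and returning 0.
  throw new IOException("At least one of the attributes provided was not found");
}
{code}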


 getfattr in CLI doesn't throw exception or return non-0 return code when 
 xattr doesn't exist
 

 Key: HDFS-6422
 URL: https://issues.apache.org/jira/browse/HDFS-6422
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.5.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch


 If you do
 hdfs dfs -getfattr -n user.blah /foo
 and user.blah doesn't exist, the command prints
 # file: /foo
 and a 0 return code.
 It should print an exception and return a non-0 return code instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056773#comment-14056773
 ] 

Kihwal Lee commented on HDFS-6647:
--

It is already checking if the file is deleted. It's just that the check is 
incomplete.

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6621) Hadoop Balancer prematurely exits iterations

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056774#comment-14056774
 ] 

Hadoop QA commented on HDFS-6621:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12653835/HDFS-6621.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7307//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7307//console

This message is automatically generated.

 Hadoop Balancer prematurely exits iterations
 

 Key: HDFS-6621
 URL: https://issues.apache.org/jira/browse/HDFS-6621
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.2.0, 2.4.0
 Environment: Red Hat Enterprise Linux Server release 5.8 with Hadoop 
 2.4.0
Reporter: Benjamin Bowman
  Labels: balancer
 Attachments: HDFS-6621.patch


 I have been having an issue with the balancing being too slow.  The issue was 
 not with the speed with which blocks were moved, but rather the balancer 
 would prematurely exit out of its balancing iterations.  It would move ~10 
 blocks or 100 MB then exit the current iteration (in which it said it was 
 planning on moving about 10 GB). 
 I looked in the Balancer.java code and believe I found and solved the issue.  
 In the dispatchBlocks() function there is a variable, 
 noPendingBlockIteration, which counts the number of iterations in which a 
 pending block to move cannot be found.  Once this number gets to 5, the 
 balancer exits the overall balancing iteration.  I believe the desired 
 functionality is 5 consecutive no pending block iterations - however this 
 variable is never reset to 0 upon block moves.  So once this number reaches 5 
 - even if there have been thousands of blocks moved in between these no 
 pending block iterations  - the overall balancing iteration will prematurely 
 end.  
 The fix I applied was to set noPendingBlockIteration = 0 when a pending block 
 is found and scheduled.  In this way, my iterations do not prematurely exit 
 unless there are 5 consecutive no pending block iterations.  Below is a copy 
 of my dispatchBlocks() function with the change I made.
 private void dispatchBlocks() {
   long startTime = Time.now();
   long scheduledSize = getScheduledSize();
   this.blocksToReceive = 2*scheduledSize;
   boolean isTimeUp = false;
   int noPendingBlockIteration = 0;
   while(!isTimeUp && getScheduledSize()>0 &&
   (!srcBlockList.isEmpty() || blocksToReceive>0)) {
 PendingBlockMove pendingBlock = chooseNextBlockToMove();
 if (pendingBlock != null) {
   noPendingBlockIteration = 0;
   // move the block
   pendingBlock.scheduleBlockMove();
   continue;
 }
 /* Since we can not schedule any block to move,
  * filter any moved blocks from the source block list and
  * check if we should fetch more blocks from the namenode
  */
 filterMovedBlocks(); // filter already moved blocks
 if (shouldFetchMoreBlocks()) {
   // fetch new blocks
   try {
 blocksToReceive -= getBlockList();
 continue;
   } catch (IOException e) {
  LOG.warn("Exception while getting block list", e);
 return;
   }
 } else {
   // source node cannot find a pendingBlockToMove, iteration +1
   noPendingBlockIteration++;
   // in case no blocks can be moved for source node's task,
   // jump out of while-loop after 5 iterations.
   if (noPendingBlockIteration >= 

[jira] [Updated] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6647:
-

Attachment: HDFS-6647.patch

The patch adds an {{isFileDeleted()}} method. This depends on HDFS-6618. I also 
made checkLease() call this method. Aaron's test case has been slightly 
modified. 
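
For reviewers, a rough sketch of the idea behind {{isFileDeleted()}} (hedged; 
the rootDir field and the exact traversal are assumptions, not the actual 
patch, and snapshot corner cases need more care):

{code}
// Since deletion nulls out the parent link (see HDFS-6618), a file that
// can no longer reach the root through parent pointers has been deleted
// from the current namespace.
private boolean isFileDeleted(INodeFile file) {
  INode inode = file;
  while (inode.getParent() != null) {
    inode = inode.getParent();
  }
  return inode != rootDir;  // detached from the tree => deleted
}
{code}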

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch, HDFS-6647.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056783#comment-14056783
 ] 

Kihwal Lee commented on HDFS-6647:
--

Marking HDFS-6618 as a dependency. I won't submit the patch until it is 
committed.

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch, HDFS-6647.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5202) Support Centralized Cache Management on Windows.

2014-07-09 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5202:


Attachment: HDFS-5202.1.patch

The attached patch gets DataNode caching working on Windows.  This mixes 
changes in Common and HDFS.  I can spin off a separate HADOOP jira for the 
Common changes after this gets reviewed and approved.

This is actually pretty simple stuff.  We just need to swap out the POSIX 
syscalls for some Windows specifics.  The relevant Windows syscalls are:

* 
[VirtualLock|http://msdn.microsoft.com/en-us/library/windows/desktop/aa366895(v=vs.85).aspx]
* 
[GetCurrentProcess|http://msdn.microsoft.com/en-us/library/windows/desktop/ms683179(v=vs.85).aspx]
* 
[GetProcessWorkingSetSize|http://msdn.microsoft.com/en-us/library/windows/desktop/ms683226(v=vs.85).aspx]
* 
[SetProcessWorkingSetSizeEx|http://msdn.microsoft.com/en-us/library/windows/desktop/ms686237(v=vs.85).aspx]

Summary of changes:
# {{NativeIO}}: I added {{extendWorkingSetSize}}, which is a new Windows-only 
JNI method that extends the minimum and maximum working set size of a Windows 
process.  Ultimately, this is what governs how much memory a Windows process is 
allowed to lock.  Full details are in the MSDN links above.  I also implemented 
{{mlock_1native}} to call {{VirtualLock}} on Windows.
# {{hdfs.cmd}}: I added cacheadmin to the supported commands on Windows.
# {{DataNode}}: Windows does not have a direct equivalent of {{ulimit -l}}.  
Instead of looking for a ulimit and enforcing that our configuration doesn't 
exceed it, we attempt to extend the working set size when running on Windows.
# {{CentralizedCacheManagement.apt.vm}}: I updated the documentation with a few 
clarifications about how it works on Windows.
# {{TestFsDatasetCache}}: We no longer need to skip this test suite on Windows. 
 The tests had a few file descriptor leaks that caused test failures on 
Windows, so I fixed that.

In addition to running the JUnit tests, I ran manual tests.  I used 
Sysinternals VMMap to confirm that the block files were getting memory-mapped 
and locked into the virtual address space of the DataNode JVM process.

http://technet.microsoft.com/en-us/sysinternals/dd535533.aspx
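
As a usage sketch, the DataNode-side call might look roughly like this (hedged; 
the exact signature of extendWorkingSetSize is in the patch, and the 
maxLockedMemory variable is illustrative):

{code}
// Sketch only.
if (Shell.WINDOWS) {
  // Windows has no ulimit -l equivalent; extend the process working set
  // instead, so that the VirtualLock-backed mlock can pin the configured
  // amount of cache memory.
  NativeIO.Windows.extendWorkingSetSize(maxLockedMemory);
}
{code}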


 Support Centralized Cache Management on Windows.
 

 Key: HDFS-5202
 URL: https://issues.apache.org/jira/browse/HDFS-5202
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0, 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Chris Nauroth
 Attachments: HDFS-5202.1.patch


 HDFS caching currently is implemented using POSIX syscalls for checking 
 ulimit and locking pages of memory into the process's address space.  These 
 POSIX syscalls do not exist on Windows.  This issue will implement equivalent 
 functionality so that Windows deployments can use Centralized Cache 
 Management.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-5202) Support Centralized Cache Management on Windows.

2014-07-09 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5202:


Status: Patch Available  (was: Open)

 Support Centralized Cache Management on Windows.
 

 Key: HDFS-5202
 URL: https://issues.apache.org/jira/browse/HDFS-5202
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0, 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Chris Nauroth
 Attachments: HDFS-5202.1.patch


 HDFS caching currently is implemented using POSIX syscalls for checking 
 ulimit and locking pages of memory into the process's address space.  These 
 POSIX syscalls do not exist on Windows.  This issue will implement equivalent 
 functionality so that Windows deployments can use Centralized Cache 
 Management.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056791#comment-14056791
 ] 

Aaron T. Myers commented on HDFS-6647:
--

Thanks for all the comments, y'all. I agree with what everyone has said here.

While we're on the subject, does it not seem strange to anyone that we allow 
the INode to still be considered under construction in a snapshot after it's 
been deleted from the present FS? I'm thinking that perhaps in addition to this 
change that Kihwal has in this patch we should make delete finalize the INode 
as well. I think that would've prevented this issue as well, since the current 
check in {{checkUCBlock}} would have failed. We could of course do that as a 
separate JIRA, or perhaps not at all if we think this is sufficient as-is.

The patch that Kihwal provided looks good to me. One small comment is that it'd 
be good to use {{GenericTestUtils#assertExceptionContains}} in the test case to 
ensure the correct exception is thrown, but that's pretty minor. +1 once that's 
addressed, either by changing the patch or by telling me I'm being too pedantic.

Kihwal - can I go ahead and assign this JIRA to you?

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch, HDFS-6647.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6618) Remove deleted INodes from INodeMap right away

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056793#comment-14056793
 ] 

Hadoop QA commented on HDFS-6618:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12654834/HDFS-6618.simpler.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7309//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7309//console

This message is automatically generated.

 Remove deleted INodes from INodeMap right away
 --

 Key: HDFS-6618
 URL: https://issues.apache.org/jira/browse/HDFS-6618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6618.AbstractList.patch, 
 HDFS-6618.inodeRemover.patch, HDFS-6618.inodeRemover.v2.patch, 
 HDFS-6618.patch, HDFS-6618.simpler.patch


 After HDFS-6527, we have not seen the edit log corruption for weeks on 
 multiple clusters until yesterday. Previously, we would see it within 30 
 minutes on a cluster.
 But the same condition was reproduced even with HDFS-6527.  The only 
 explanation is that the RPC handler thread serving {{addBlock()}} was 
 accessing stale parent value.  Although nulling out parent is done inside the 
 {{FSNamesystem}} and {{FSDirectory}} write lock, there is no memory barrier 
 because there is no synchronized block involved in the process.
 I suggest making parent volatile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056803#comment-14056803
 ] 

Kihwal Lee commented on HDFS-6647:
--

bq. does it not seem strange to anyone that we allow the INode to still be 
considered under construction in a snapshot after it's been deleted from the 
present FS.
It does. But I think closing the file in this case is a bit complicated. I can 
think of many corner cases.  The snapshot experts should chime in.

I will address the review comment.

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch, HDFS-6647.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056798#comment-14056798
 ] 

Aaron T. Myers commented on HDFS-6647:
--

Oh,  sorry, Kihwal - shouldn't this code also be checking to ensure that the 
file is under construction as well?

{code}
-if (file == null || !file.isUnderConstruction()) {
+if (file == null || isFileDeleted(file)) {
{code}

i.e. I think it should be:

{code}
if (file == null || !file.isUnderConstruction() || isFileDeleted(file)) {
{code}

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch, HDFS-6647.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056809#comment-14056809
 ] 

Aaron T. Myers commented on HDFS-6647:
--

bq. It does. But I think closing the file in this case is a bit complicated. I 
can think of many corner cases. The snapshot experts should chime in.

Yea, I figured that'd be more complex. Totally fine to punt on that for now.

Thanks, Kihwal.

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch, HDFS-6647.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056815#comment-14056815
 ] 

Kihwal Lee commented on HDFS-6647:
--

bq. Oh, sorry, Kihwal - shouldn't this code also be checking to ensure that the 
file is under construction as well?
Yes, you have passed the test, Aaron. :)  :) I will get it fixed.

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch, HDFS-6647.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056820#comment-14056820
 ] 

Aaron T. Myers commented on HDFS-6647:
--

You're the man, Kihwal. :)

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch, HDFS-6647.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6643) Refactor INodeFile.HeaderFormat and INodeWithAdditionalFields.PermissionStatusFormat

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056821#comment-14056821
 ] 

Hadoop QA commented on HDFS-6643:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12654706/h6643_20140708b.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7310//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7310//console

This message is automatically generated.

 Refactor INodeFile.HeaderFormat and 
 INodeWithAdditionalFields.PermissionStatusFormat
 

 Key: HDFS-6643
 URL: https://issues.apache.org/jira/browse/HDFS-6643
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h6643_20140708.patch, h6643_20140708b.patch


 Their usage is very similar.  We should change INodeFile.HeaderFormat to an 
 enum and refactor them for code reuse.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6647:
-

Attachment: HDFS-6647.v2.patch

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch, HDFS-6647.patch, 
 HDFS-6647.v2.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6422) getfattr in CLI doesn't throw exception or return non-0 return code when xattr doesn't exist

2014-07-09 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056872#comment-14056872
 ] 

Charles Lamb commented on HDFS-6422:


[~umamaheswararao]

Actually, I don't think checking the owner for listXAttrs is correct. In Linux, 
the equivalent of listXAttrs only requires scan permission on the owning 
directory, so I think we should do the same. Therefore, instead of checkOwner, I 
think we want to do checkParentAccess(EXECUTE).

Are you ok with that?



 getfattr in CLI doesn't throw exception or return non-0 return code when 
 xattr doesn't exist
 

 Key: HDFS-6422
 URL: https://issues.apache.org/jira/browse/HDFS-6422
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.5.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: HDFS-6422.1.patch, HDFS-6422.2.patch, HDFS-6422.3.patch


 If you do
 hdfs dfs -getfattr -n user.blah /foo
 and user.blah doesn't exist, the command prints
 # file: /foo
 and a 0 return code.
 It should print an exception and return a non-0 return code instead.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6645) Add test for successive Snapshots between XAttr modifications

2014-07-09 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6645:
--

Attachment: HDFS-6645.001.patch

Chris Nauroth and Giridharan Kesavan fixed the PreCommit Jenkins job issues.

Resubmitting an identical patch to trigger Hadoop QA.

 Add test for successive Snapshots between XAttr modifications
 -

 Key: HDFS-6645
 URL: https://issues.apache.org/jira/browse/HDFS-6645
 Project: Hadoop HDFS
  Issue Type: Test
  Components: snapshots, test
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Minor
 Attachments: HDFS-6645.001.patch, HDFS-6645.001.patch, 
 HDFS-6645.001.patch


 In the current TestXAttrWithSnapshot unit tests, we create a single snapshot 
 per test.
 We should test taking multiple snapshots on a path in between XAttr 
 modifications of that path. We should also verify that deletion of a snapshot 
 does not somehow alter the XAttrs of the other snapshots of the same path.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6647) Edit log corruption when pipeline recovery occurs for deleted file present in snapshot

2014-07-09 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056896#comment-14056896
 ] 

Aaron T. Myers commented on HDFS-6647:
--

The latest patch looks good to me. +1 pending HDFS-6618 being committed and the 
Jenkins seal of approval.

 Edit log corruption when pipeline recovery occurs for deleted file present in 
 snapshot
 --

 Key: HDFS-6647
 URL: https://issues.apache.org/jira/browse/HDFS-6647
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, snapshots
Affects Versions: 2.4.1
Reporter: Aaron T. Myers
Assignee: Kihwal Lee
Priority: Blocker
 Attachments: HDFS-6647-failing-test.patch, HDFS-6647.patch, 
 HDFS-6647.v2.patch


 I've encountered a situation wherein an OP_UPDATE_BLOCKS can appear in the 
 edit log for a file after an OP_DELETE has previously been logged for that 
 file. Such an edit log sequence cannot then be successfully read by the 
 NameNode.
 More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HDFS-6652) RecoverLease cannot success and file cannot be closed under high load

2014-07-09 Thread Juan Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Yu updated HDFS-6652:
--

Attachment: testLeaseRecoveryWithMultiWriters.patch

Here is a unit test to reproduce this issue.
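
For context, a hedged sketch of the recovery step the test exercises (paths and 
flow are illustrative, not the attached test):

{code}
// Several clients write to the same file in quick succession; lease
// recovery is then requested. Under high load it can fail to complete,
// leaving the file open.
DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
boolean recovered = dfs.recoverLease(new Path("/test/file"));
// Expectation: repeated calls eventually return true and the file closes.
{code}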

 RecoverLease cannot success and file cannot be closed under high load
 -

 Key: HDFS-6652
 URL: https://issues.apache.org/jira/browse/HDFS-6652
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Juan Yu
Priority: Minor
 Attachments: testLeaseRecoveryWithMultiWriters.patch


 When multiple clients try to write to the same file frequently, there is a 
 chance that the block state goes wrong, so lease recovery cannot be done and 
 the file cannot be closed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6652) RecoverLease cannot success and file cannot be closed under high load

2014-07-09 Thread Juan Yu (JIRA)
Juan Yu created HDFS-6652:
-

 Summary: RecoverLease cannot success and file cannot be closed 
under high load
 Key: HDFS-6652
 URL: https://issues.apache.org/jira/browse/HDFS-6652
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Juan Yu
Priority: Minor
 Attachments: testLeaseRecoveryWithMultiWriters.patch

When multiple clients try to write to the same file frequently, there is a 
chance that the block state goes wrong, so lease recovery cannot be done and the 
file cannot be closed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6652) RecoverLease cannot success and file cannot be closed under high load

2014-07-09 Thread Juan Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056936#comment-14056936
 ] 

Juan Yu commented on HDFS-6652:
---

The lease recovery failure is related to HDFS-4504

 RecoverLease cannot success and file cannot be closed under high load
 -

 Key: HDFS-6652
 URL: https://issues.apache.org/jira/browse/HDFS-6652
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Juan Yu
Priority: Minor
 Attachments: testLeaseRecoveryWithMultiWriters.patch


 When multiple clients try to write to the same file frequently, there is a 
 chance that the block state goes wrong, so lease recovery cannot be done and 
 the file cannot be closed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-4504) DFSOutputStream#close doesn't always release resources (such as leases)

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056945#comment-14056945
 ] 

Hadoop QA commented on HDFS-4504:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12599081/HDFS-4504.016.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7313//console

This message is automatically generated.

 DFSOutputStream#close doesn't always release resources (such as leases)
 ---

 Key: HDFS-4504
 URL: https://issues.apache.org/jira/browse/HDFS-4504
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-4504.001.patch, HDFS-4504.002.patch, 
 HDFS-4504.007.patch, HDFS-4504.008.patch, HDFS-4504.009.patch, 
 HDFS-4504.010.patch, HDFS-4504.011.patch, HDFS-4504.014.patch, 
 HDFS-4504.015.patch, HDFS-4504.016.patch


 {{DFSOutputStream#close}} can throw an {{IOException}} in some cases.  One 
 example is if there is a pipeline error and then pipeline recovery fails.  
 Unfortunately, in this case, some of the resources used by the 
 {{DFSOutputStream}} are leaked.  One particularly important resource is file 
 leases.
 So it's possible for a long-lived HDFS client, such as Flume, to write many 
 blocks to a file, but then fail to close it.  Unfortunately, the 
 {{LeaseRenewerThread}} inside the client will continue to renew the lease for 
 the undead file.  Future attempts to close the file will just rethrow the 
 previous exception, and no progress can be made by the client.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5202) Support Centralized Cache Management on Windows.

2014-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14056958#comment-14056958
 ] 

Hadoop QA commented on HDFS-5202:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12654878/HDFS-5202.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.
See 
https://builds.apache.org/job/PreCommit-HDFS-Build/7311//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.fs.shell.TestCopyPreserveFlag
  org.apache.hadoop.fs.TestSymlinkLocalFSFileContext
  org.apache.hadoop.fs.shell.TestTextCommand
  org.apache.hadoop.ipc.TestIPC
  org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem
  org.apache.hadoop.fs.shell.TestPathData
  org.apache.hadoop.fs.TestDFVariations

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7311//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7311//console

This message is automatically generated.

 Support Centralized Cache Management on Windows.
 

 Key: HDFS-5202
 URL: https://issues.apache.org/jira/browse/HDFS-5202
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0, 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Chris Nauroth
 Attachments: HDFS-5202.1.patch


 HDFS caching currently is implemented using POSIX syscalls for checking 
 ulimit and locking pages of memory into the process's address space.  These 
 POSIX syscalls do not exist on Windows.  This issue will implement equivalent 
 functionality so that Windows deployments can use Centralized Cache 
 Management.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

