[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits

2020-06-10 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132544#comment-17132544
 ] 

Xiaoqiao He commented on HDFS-12733:


Thanks [~ayushtkn] for your reviews.
{quote}This block could have pitched in if noShared was empty, Now this will 
not happen, Is this desirable?{quote}
If `dfs.namenode.edits.dir` and `dfs.namenode.shared.edits.dir` are both empty, 
this logic will be invoked and will use DFSConfigKeys.DFS_NAMENODE_EDITS_DIR_DEFAULT 
as the edits directory by default. The unit test 
TestFailureOfSharedDir#testLocalEditsDisabledWithoutSharedEditsDir covers 
this case.
{quote}Shouldn't we check have a check that editsDirs isn't empty as 
well?{quote}
The original thought is that we should ignore `noSharedEditDirs` only when it 
is set to empty, rather than whenever `shareEditDirs` is not empty.
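
The fallback described in this comment can be sketched in plain Java. This is only an illustration under stated assumptions, not the patch's code: the class `EditsDirFallback`, the method `resolveEditsDirs`, the placeholder default value, and the ordering (shared directories first) are all hypothetical. In HDFS the default comes from DFSConfigKeys.DFS_NAMENODE_EDITS_DIR_DEFAULT.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; the names and the default value are hypothetical. */
class EditsDirFallback {
  // Placeholder standing in for DFSConfigKeys.DFS_NAMENODE_EDITS_DIR_DEFAULT.
  static final String DEFAULT_EDITS_DIR = "file:///tmp/hadoop/dfs/name";

  /**
   * Combines the shared and local edits directories. The default is used
   * only when both configured lists are empty.
   */
  static List<String> resolveEditsDirs(List<String> localDirs,
      List<String> sharedDirs) {
    List<String> result = new ArrayList<>(sharedDirs);
    result.addAll(localDirs);
    if (result.isEmpty()) {
      result.add(DEFAULT_EDITS_DIR);
    }
    return result;
  }
}
```

With an empty local list and a non-empty shared list, only the shared directory is returned, which corresponds to running with local edits disabled.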

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch, 
> HDFS-12733.006.patch, HDFS-12733.007.patch, HDFS-12733.008.patch, 
> HDFS-12733.009.patch
>
>
> As of now, edits are written to both local and shared locations, which is 
> redundant, and the local edits are never used in an HA setup.
> Disabling local edits gives a small performance improvement.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12969) DfsAdmin listOpenFiles should report files by type

2020-06-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130934#comment-17130934
 ] 

hemanthboyina commented on HDFS-12969:
--

Any suggestions for this, [~liuml07] [~elgoiri]?

> DfsAdmin listOpenFiles should report files by type
> --
>
> Key: HDFS-12969
> URL: https://issues.apache.org/jira/browse/HDFS-12969
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>Priority: Major
>
> HDFS-11847 introduced a new option, {{-blockingDecommission}}, to the 
> existing command {{dfsadmin -listOpenFiles}}. But the reporting done by the 
> command doesn't differentiate the files by type (such as blocking 
> decommission). In order to change the reporting style, the proto format used 
> for the base command has to be updated to carry additional fields, which is 
> better done in a new jira outside of HDFS-11847. This jira tracks the 
> end-to-end enhancements needed for the dfsadmin -listOpenFiles console output.






[jira] [Commented] (HDFS-15351) Blocks Scheduled Count was wrong on Truncate

2020-06-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130927#comment-17130927
 ] 

hemanthboyina commented on HDFS-15351:
--

Yes [~elgoiri], I think we can go ahead.

> Blocks Scheduled Count was wrong on Truncate 
> -
>
> Key: HDFS-15351
> URL: https://issues.apache.org/jira/browse/HDFS-15351
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15351.001.patch, HDFS-15351.002.patch, 
> HDFS-15351.003.patch
>
>
> On truncate and append we remove the blocks from the Reconstruction Queue. 
> On removing the blocks from pending reconstruction, we need to decrement 
> Blocks Scheduled.
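
The bookkeeping this issue asks for can be sketched as follows. This is a simplified, self-contained model rather than the BlockManager code: `ScheduledTracker`, `scheduleReconstruction`, and `removePending` are hypothetical names, and a plain per-datanode counter stands in for the real Blocks Scheduled accounting.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of pending reconstructions and scheduled counts. */
class ScheduledTracker {
  private final Map<Long, List<String>> pendingTargets = new HashMap<>();
  private final Map<String, Integer> blocksScheduled = new HashMap<>();

  /** Records a pending reconstruction and increments each target's count. */
  void scheduleReconstruction(long blockId, List<String> targets) {
    pendingTargets.put(blockId, targets);
    for (String dn : targets) {
      blocksScheduled.merge(dn, 1, Integer::sum);
    }
  }

  /**
   * Removes a block from pending reconstruction (for example on truncate)
   * and decrements the scheduled count of every target it was assigned to.
   */
  void removePending(long blockId) {
    List<String> targets = pendingTargets.remove(blockId);
    if (targets == null) {
      return; // nothing was pending for this block
    }
    for (String dn : targets) {
      // Drop the entry entirely once the count would reach zero.
      blocksScheduled.computeIfPresent(dn, (k, v) -> v > 1 ? v - 1 : null);
    }
  }

  int getBlocksScheduled(String datanode) {
    return blocksScheduled.getOrDefault(datanode, 0);
  }
}
```

If the decrement in `removePending` were skipped, the counts would stay inflated after a truncate, which is the kind of stale count the issue title describes.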






[jira] [Commented] (HDFS-15372) Files in snapshots no longer see attribute provider permissions

2020-06-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130923#comment-17130923
 ] 

hemanthboyina commented on HDFS-15372:
--

Thanks for the patch, [~sodonnell]. The changes look good.

> Files in snapshots no longer see attribute provider permissions
> ---
>
> Key: HDFS-15372
> URL: https://issues.apache.org/jira/browse/HDFS-15372
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15372.001.patch, HDFS-15372.002.patch, 
> HDFS-15372.003.patch, HDFS-15372.004.patch, HDFS-15372.005.patch
>
>
> Given a cluster with an authorization provider configured (eg Sentry) and the 
> paths covered by the provider are snapshotable, there was a change in 
> behaviour in how the provider permissions and ACLs are applied to files in 
> snapshots between the 2.x branch and Hadoop 3.0.
> Eg, if we have the snapshotable path /data, which is Sentry managed. The ACLs 
> below are provided by Sentry:
> {code}
> hadoop fs -getfacl -R /data
> # file: /data
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::---
> group:flume:rwx
> user:hive:rwx
> group:hive:rwx
> group:testgroup:rwx
> mask::rwx
> other::--x
> /data/tab1
> {code}
> After taking a snapshot, the files in the snapshot do not see the provider 
> permissions:
> {code}
> hadoop fs -getfacl -R /data/.snapshot
> # file: /data/.snapshot
> # owner: 
> # group: 
> user::rwx
> group::rwx
> other::rwx
> # file: /data/.snapshot/snap1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/.snapshot/snap1/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> {code}
> However pre-Hadoop 3.0 (when the attribute provider etc was extensively 
> refactored) snapshots did get the provider permissions.
> The reason is this code in FSDirectory.java which ultimately calls the 
> attribute provider and passes the path we want permissions for:
> {code}
>   INodeAttributes getAttributes(INodesInPath iip)
>       throws IOException {
>     INode node = FSDirectory.resolveLastINode(iip);
>     int snapshot = iip.getPathSnapshotId();
>     INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
>     UserGroupInformation ugi = NameNode.getRemoteUser();
>     INodeAttributeProvider ap = this.getUserFilteredAttributeProvider(ugi);
>     if (ap != null) {
>       // permission checking sends the full components array including the
>       // first empty component for the root.  however file status
>       // related calls are expected to strip out the root component according
>       // to TestINodeAttributeProvider.
>       byte[][] components = iip.getPathComponents();
>       components = Arrays.copyOfRange(components, 1, components.length);
>       nodeAttrs = ap.getAttributes(components, nodeAttrs);
>     }
>     return nodeAttrs;
>   }
> {code}
> The line:
> {code}
> INode node = FSDirectory.resolveLastINode(iip);
> {code}
> Picks the last resolved Inode and if you then call node.getPathComponents, 
> for a path like '/data/.snapshot/snap1/tab1' it will return /data/tab1. It 
> resolves the snapshot path to its original location, but it's still the 
> snapshot inode.
> However the logic passes 'iip.getPathComponents' which returns 
> "/data/.snapshot/snap1/tab1" to the provider.
> The pre Hadoop 3.0 code passes the inode directly to the provider, and hence 
> it only ever sees the path as "/data/tab1".
> It is debatable which path should be passed to the provider - 
> /data/.snapshot/snap1/tab1 or /data/tab1 in the case of snapshots. However as 
> the behaviour has changed I feel we should ensure the old behaviour is 
> retained.
> It would also be fairly easy to provide a config switch so the provider gets 
> the full snapshot path or the resolved path.
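
The config switch suggested at the end of this description could look roughly like the sketch below. It is an assumption-laden illustration, not the attached patch: `SnapshotPathResolver`, `componentsForProvider`, and the `useResolvedPath` flag are hypothetical, and string components are used in place of the `byte[][]` arrays that FSDirectory works with.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of FSDirectory. */
class SnapshotPathResolver {
  static final String DOT_SNAPSHOT = ".snapshot";

  /**
   * Returns the components to hand to the provider. When useResolvedPath is
   * true, the ".snapshot" component and the snapshot name after it are
   * dropped, so /data/.snapshot/snap1/tab1 becomes /data/tab1. Otherwise
   * the full snapshot path is passed through unchanged.
   */
  static String[] componentsForProvider(String[] components,
      boolean useResolvedPath) {
    if (!useResolvedPath) {
      return components.clone();
    }
    List<String> out = new ArrayList<>();
    for (int i = 0; i < components.length; i++) {
      if (DOT_SNAPSHOT.equals(components[i])) {
        i++; // also skip the snapshot name, if one follows
        continue;
      }
      out.add(components[i]);
    }
    return out.toArray(new String[0]);
  }
}
```

Passing `true` gives the provider the resolved path, matching the pre-3.0 behaviour described above; passing `false` keeps the full snapshot path.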






[jira] [Commented] (HDFS-15346) RBF: DistCpFedBalance implementation

2020-06-10 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130903#comment-17130903
 ] 

Yiqun Lin commented on HDFS-15346:
--

[~LiJinglun] , thanks for addressing the comments, almost looks good now.
{quote}Agree with you ! Using a fedbalance-default.xml is much better.
{quote}
Would you create a subtask JIRA for this? Let's try to complete this at a later 
time.
{quote}I'll try to figure it out. But it might be quite tricky as the unit 
tests use both MiniDFSCluster and MiniMRYarnCluster. And there are many rounds 
of distcp. Please tell me if you have any suggestions, thanks
{quote}
I will take a further look at this later. In any case, the unit tests all pass 
currently, so this is okay for me.

Still some remaining minor comments:

*hadoop-federation-balance/pom.xml*
{noformat}
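+    <dependency>
+      <groupId>org.bouncycastle</groupId>
+      <artifactId>bcprov-jdk15on</artifactId>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.bouncycastle</groupId>
+      <artifactId>bcpkix-jdk15on</artifactId>
+      <scope>test</scope>
+    </dependency>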
{noformat}
These two dependencies seem unrelated; can we remove them?

*DistCpFedBalance.java/FedBalance.java*
 I don't see why we define another class FedBalance. It can simply be merged 
into DistCpFedBalance. I would prefer to override the main method in 
DistCpFedBalance and then rename DistCpFedBalance to FedBalance.

*DistCpBalanceOptions.java*
 I found two places that can be described more clearly:
 # I prefer to move the detailed comment into the option description so that 
users can learn the details of this option.
{code:java}
/**
 * Run in router-based federation mode.
 */
final static Option ROUTER =
    new Option("router", false, "If `true` the command runs in router mode."
        + " The source path is taken as a mount point. It will disable write"
        + " by setting the mount point readonly. Otherwise the command works"
        + " in normal federation mode. The source path is taken as the full"
        + " path. It will disable read and write by cancelling all"
        + " permissions of the source path. The default value is `false`.");
{code}
 

 # The description of the delay option is hard to understand. I made a minor 
change to it. [~LiJinglun], if you have a better description for this option, 
feel free to update it.
{code:java}
/* Specify the delayed duration (milliseconds) to recover the job. */
final static Option DELAY_DURATION = new Option("delay", true,
    "The delayed duration (milliseconds) to recover the job and continue to"
    + " run when the job is detected as unfinished and waits to complete.");
{code}

*DistCpProcedure.java*
 # Move {{srcFs.allowSnapshot(src);}} to the end of the method. We should do 
the allow snapshot operation only after the snapshot check.
{code:java}
+
+  private void cleanUpBeforeInitDistcp() throws IOException {
+    if (dstFs.exists(dst)) { // clean up.
+      throw new IOException("The dst path=" + dst + " already exists. The admin"
+          + " should delete it before submitting the initial distcp job.");
+    }
+    Path snapshotPath = new Path(src,
+        HdfsConstants.DOT_SNAPSHOT_DIR_SEPARATOR + CURRENT_SNAPSHOT_NAME);
+    if (srcFs.exists(snapshotPath)) {
+      throw new IOException("The src snapshot=" + snapshotPath +
+          " already exists. The admin should delete the snapshot before"
+          + " submitting the initial distcp.");
+    }
     srcFs.allowSnapshot(src); <--- move to here 
+  }
{code}

*FedBalanceContext.java*
 # Please add the necessary separators in the toString method, like this:
{code:java}
  public String toString() {
    StringBuilder builder = new StringBuilder("FedBalance context:");
    builder.append(" src=").append(src);
    builder.append(", dst=").append(dst);
    if (useMountReadOnly) {
      builder.append(", router-mode=true");
      builder.append(", mount-point=").append(mount);
    } else {
      builder.append(", router-mode=false");
    }
    builder.append(", forceCloseOpenFiles=").append(forceCloseOpenFiles);
    builder.append(", trash=").append(trashOpt.name());
    builder.append(", map=").append(mapNum);
    builder.append(", bandwidth=").append(bandwidthLimit);
    return builder.toString();
  }
{code}

 # Can you add the newly added delayDuration option to this class?

> RBF: DistCpFedBalance implementation
> 
>
> Key: HDFS-15346
> URL: https://issues.apache.org/jira/browse/HDFS-15346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15346.001.patch, HDFS-15346.002.patch, 
> HDFS-15346.003.patch, HDFS-15346.004.patch, HDFS-15346.005.patch, 
> HDFS-15346.006.patch, HDFS-15346.007.patch, HDFS-15346.008.patch, 
> HDFS-15346.009.patch
>
>
> Patch in HDFS-15294 is too big to review so we split it into 2 patches. This 
> is the second one. Detail can be found at HDFS-15294.




[jira] [Commented] (HDFS-15406) Improve the speed of Datanode Block Scan

2020-06-10 Thread ludun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130654#comment-17130654
 ] 

ludun commented on HDFS-15406:
--

We took a jstack of the datanode, which has 11M blocks, and found that 
getDiskReport ran for nearly 23 minutes and then held the lock for about 6 
minutes to process the scan.
{code}
// getDiskReport  start
2020-06-10 11:48:14
--
"java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty queue]" #707 daemon prio=5 os_prio=0 tid=0x902e7800 nid=0xc681 waiting on condition [0xfff71c0bd000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0xfff7d4f73220> (a java.util.concurrent.FutureTask)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
    at java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:549)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:393)
---
2020-06-10 12:11:36
--
"java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty queue]" #707 daemon prio=5 os_prio=0 tid=0x902e7800 nid=0xc681 runnable [0xfff71c0bd000]
   java.lang.Thread.State: RUNNABLE
    at java.util.ComparableTimSort.mergeHi(ComparableTimSort.java:817)
    at java.util.ComparableTimSort.mergeAt(ComparableTimSort.java:483)
    at java.util.ComparableTimSort.mergeForceCollapse(ComparableTimSort.java:422)
    at java.util.ComparableTimSort.sort(ComparableTimSort.java:222)
    at java.util.Arrays.sort(Arrays.java:1246)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner$ScanInfoPerBlockPool.toSortedArrays(DirectoryScanner.java:204)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:574)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:393)
{code}
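
For reference, the durations quoted in this comment can be checked against the two jstack timestamps and the `lockHeldTimeMs` value from the issue description. The snippet below is only a worked calculation; the class and method names are made up for the example, and the two timestamps are sample points, so the first figure is the time between the dumps rather than an exact measurement of getDiskReport.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Worked arithmetic for the timings in the thread; names are illustrative. */
class ScanTimings {
  private static final DateTimeFormatter FMT =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

  /** Elapsed time between two jstack timestamps. */
  static Duration between(String start, String end) {
    return Duration.between(LocalDateTime.parse(start, FMT),
        LocalDateTime.parse(end, FMT));
  }

  public static void main(String[] args) {
    Duration diskReport =
        between("2020-06-10 11:48:14", "2020-06-10 12:11:36");
    Duration lockHeld = Duration.ofMillis(329854L);
    // 1402 s is 23 min 22 s between the two thread dumps.
    System.out.println("between dumps: " + diskReport.getSeconds() + " s");
    // 329854 ms is roughly 5.5 minutes of lock hold time.
    System.out.println("lock held: " + lockHeld.getSeconds() + " s");
  }
}
```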


> Improve the speed of Datanode Block Scan
> 
>
> Key: HDFS-15406
> URL: https://issues.apache.org/jira/browse/HDFS-15406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
>
> In our customer cluster we have approx 10M blocks in one datanode. 
> For the Datanode to scan all the blocks, it has taken nearly 5 mins.
> {code:java}
> 2020-06-10 12:17:06,869 | INFO  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 
> 11149530, missing metadata files:472, missing block files:472, missing blocks 
> in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
> 2020-06-10 12:17:06,869 | WARN  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | Lock held time above threshold: lock identifier: 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
> lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>  | InstrumentedLock.java:143 {code}




[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-06-10 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130534#comment-17130534
 ] 

Stephen O'Donnell commented on HDFS-15160:
--

[~Jiang Xin] Have you been able to test this patch on your cluster? If so did 
it work well?

For the others interested in this change - we think this patch is good to 
commit to trunk, but we would like to see it validated on a cluster or two, due 
to the risk in changing the locking. If any of you can try it on a real cluster 
it would be a great help.

[~zhuqi] previously tried it on his cluster and it seemed to work well. Do you 
have any further feedback?

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> HDFS-15160.006.patch, image-2020-04-10-17-18-08-128.png, 
> image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use the 
> read lock rather than the write lock to improve concurrency. The first step 
> is to make the changes to ReplicaMap, as many other methods make calls to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (eg 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and fsdatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.
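
The general pattern described here, taking the read lock for read-only operations and keeping the write lock for mutations, can be sketched with the standard `java.util.concurrent.locks.ReentrantReadWriteLock`. This is a generic illustration and not the HDFS code: `GuardedReplicaMap` and its methods are hypothetical stand-ins for ReplicaMap, and the datanode's own lock wrappers are not used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a replica map guarded by a read/write lock. */
class GuardedReplicaMap {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<Long, String> replicas = new HashMap<>();

  /** Mutation: needs exclusive access, so it takes the write lock. */
  void add(long blockId, String replica) {
    lock.writeLock().lock();
    try {
      replicas.put(blockId, replica);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Read-only lookup: many readers may hold the read lock at once. */
  String get(long blockId) {
    lock.readLock().lock();
    try {
      return replicas.get(blockId);
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Read-only copy of the map, also taken under the read lock. */
  Map<Long, String> copy() {
    lock.readLock().lock();
    try {
      return new HashMap<>(replicas);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Readers such as `get` and `copy` no longer block each other, while `add` still excludes both readers and other writers.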






[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-06-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130521#comment-17130521
 ] 

hemanthboyina commented on HDFS-15160:
--

Even so, the scanner takes a long time to scan all the blocks. I have filed 
HDFS-15406 to improve it.







[jira] [Updated] (HDFS-15406) Improve the speed of Datanode Block Scan

2020-06-10 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15406:
-
Description: 
In our customer cluster we have approx 10M blocks in one datanode. 

For the Datanode to scan all the blocks, it has taken nearly 5 mins.
{code:java}
2020-06-10 12:17:06,869 | INFO  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 
11149530, missing metadata files:472, missing block files:472, missing blocks 
in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
2020-06-10 12:17:06,869 | WARN  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | Lock held time above threshold: lock identifier: 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
 | InstrumentedLock.java:143 {code}

  was:
In our customer cluster we have approx 10M blocks in one datanode 

When Datanode scans all the blocks , it has taken more time 
{code:java}
2020-06-10 12:17:06,869 | INFO  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 
11149530, missing metadata files:472, missing block files:472, missing blocks 
in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
2020-06-10 12:17:06,869 | WARN  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | Lock held time above threshold: lock identifier: 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
 | InstrumentedLock.java:143 {code}


> Improve the speed of Datanode Block Scan
> 
>
> Key: HDFS-15406
> URL: https://issues.apache.org/jira/browse/HDFS-15406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
>
> In our customer cluster we have approx 10M blocks in one datanode 
> the Datanode to scans all the blocks , it has 

[jira] [Updated] (HDFS-15406) Improve the speed of Datanode Block Scan

2020-06-10 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15406:
-
Description: 
In our customer cluster we have approx 10M blocks in one datanode 

When Datanode scans all the blocks , it has taken more time 

> Improve the speed of Datanode Block Scan
> 
>
> Key: HDFS-15406
> URL: https://issues.apache.org/jira/browse/HDFS-15406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
>
> In our customer cluster we have approx 10M blocks in one datanode 
> When Datanode scans all the blocks , it has taken more time 






[jira] [Updated] (HDFS-15406) Improve the speed of Datanode Block Scan

2020-06-10 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15406:
-
Description: 
In our customer cluster we have approx 10M blocks in one datanode 

When Datanode scans all the blocks , it has taken more time 
{code:java}
2020-06-10 12:17:06,869 | INFO  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 
11149530, missing metadata files:472, missing block files:472, missing blocks 
in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
2020-06-10 12:17:06,869 | WARN  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | Lock held time above threshold: lock identifier: 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
 | InstrumentedLock.java:143 {code}

  was:
In our customer cluster we have approx 10M blocks in one datanode 

When Datanode scans all the blocks , it has taken more time 


> Improve the speed of Datanode Block Scan
> 
>
> Key: HDFS-15406
> URL: https://issues.apache.org/jira/browse/HDFS-15406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
>
> In our customer cluster we have approx 10M blocks in one datanode 
> When Datanode scans all the blocks , it has taken more time 
> {code:java}
> 2020-06-10 12:17:06,869 | INFO  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 
> 11149530, missing metadata files:472, missing block files:472, missing blocks 
> in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
> 2020-06-10 12:17:06,869 | WARN  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | Lock held time above threshold: lock identifier: 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
> lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 

[jira] [Created] (HDFS-15406) Improve the speed of Datanode Block Scan

2020-06-10 Thread hemanthboyina (Jira)
hemanthboyina created HDFS-15406:


 Summary: Improve the speed of Datanode Block Scan
 Key: HDFS-15406
 URL: https://issues.apache.org/jira/browse/HDFS-15406
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: hemanthboyina
Assignee: hemanthboyina






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-06-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130497#comment-17130497
 ] 

hemanthboyina edited comment on HDFS-15160 at 6/10/20, 10:48 AM:
-

Thanks [~pilchard] for the report. HDFS-15150 introduced read and write 
locks in the datanode.

With HDFS-15160 we acquire the read lock for the scanner, so the write won't be blocked.


was (Author: hemanthboyina):
Thanks [~pilchard] for the report. HDFS-15150 introduced read and write 
locks in the datanode.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> HDFS-15160.006.patch, image-2020-04-10-17-18-08-128.png, 
> image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use the 
> read lock rather than the write lock to improve concurrency. The first step 
> is to make the changes to ReplicaMap, as many other methods make calls to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.
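
The read-lock idea in the description above can be illustrated with a minimal sketch built on the JDK's ReentrantReadWriteLock. The class ReplicaMapSketch and its methods are hypothetical stand-ins invented for this example; they are not Hadoop's ReplicaMap or FsDatasetImpl. The sketch only shows the general property being relied on: lookups that take the shared read lock can overlap with each other, while mutations still take the exclusive write lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a lock-protected replica map; NOT Hadoop's ReplicaMap.
class ReplicaMapSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<Long, String> replicas = new HashMap<>();

    // Mutation: takes the exclusive write lock.
    void add(long blockId, String state) {
        lock.writeLock().lock();
        try {
            replicas.put(blockId, state);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Lookup: the shared read lock is enough, so lookups can run concurrently.
    String get(long blockId) {
        lock.readLock().lock();
        try {
            return replicas.get(blockId);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Returns true if a lookup on another thread completes while this thread
    // is still holding the read lock, i.e. readers do not block each other.
    boolean readersCanOverlap(long blockId) {
        lock.readLock().lock();
        try {
            boolean[] found = {false};
            Thread other = new Thread(() -> found[0] = get(blockId) != null);
            other.start();
            other.join(5000);
            return found[0];
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Note that with this kind of lock a thread asking for the write lock still waits until all current readers have released the read lock; the gain is between readers.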






[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-06-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130497#comment-17130497
 ] 

hemanthboyina commented on HDFS-15160:
--

Thanks [~pilchard] for the report. HDFS-15150 introduced read and write 
locks in the datanode.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> HDFS-15160.006.patch, image-2020-04-10-17-18-08-128.png, 
> image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use the 
> read lock rather than the write lock to improve concurrency. The first step 
> is to make the changes to ReplicaMap, as many other methods make calls to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.






[jira] [Comment Edited] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-06-10 Thread ludun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130342#comment-17130342
 ] 

ludun edited comment on HDFS-15160 at 6/10/20, 10:03 AM:
-

[~brahmareddy] please check this issue as well. In our customer environment, the 
Directory Scanner blocks writes for a long time (300s+). Although the datanode 
has too many blocks, we should not hold the write lock for the scanner.
{code}
2020-06-10 12:17:06,869 | INFO  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | BlockPool BP-1104115233-xx-1571300215588 Total blocks: 11149530, 
missing metadata files:472, missing block files:472, missing blocks in 
memory:0, mismatched blocks:0 | DirectoryScanner.java:473
2020-06-10 12:17:06,869 | WARN  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | Lock held time above threshold: lock identifier: 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
 | InstrumentedLock.java:143
{code}
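
The WARN line in the log above is emitted on unlock (the trace passes through InstrumentedLock.unlock, check and logWarning). A minimal sketch of that general pattern, measuring how long a lock was held and warning past a threshold, could look like the following. HoldTimeLockSketch, its injectable clock and the threshold value are all assumptions made for illustration; this is not Hadoop's InstrumentedLock and does not reproduce its configuration or warning suppression.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongSupplier;

// Hypothetical hold-time tracker for illustration; NOT Hadoop's InstrumentedLock.
class HoldTimeLockSketch {
    private final ReentrantLock inner = new ReentrantLock();
    private final long warnThresholdMs;   // assumed threshold, chosen by the caller
    private final LongSupplier nanoClock; // injectable clock, e.g. System::nanoTime
    private long acquiredAtNanos;         // only touched while the lock is held
    private volatile long lastHeldMs;
    private volatile int warnings;

    HoldTimeLockSketch(long warnThresholdMs, LongSupplier nanoClock) {
        this.warnThresholdMs = warnThresholdMs;
        this.nanoClock = nanoClock;
    }

    void lock() {
        inner.lock();
        if (inner.getHoldCount() == 1) { // record only the outermost acquisition
            acquiredAtNanos = nanoClock.getAsLong();
        }
    }

    void unlock() {
        boolean warn = false;
        long heldMs = 0;
        if (inner.getHoldCount() == 1) { // measure only on the outermost release
            heldMs = TimeUnit.NANOSECONDS.toMillis(nanoClock.getAsLong() - acquiredAtNanos);
            lastHeldMs = heldMs;
            warn = heldMs > warnThresholdMs;
            if (warn) {
                warnings++; // safe: still inside the critical section
            }
        }
        inner.unlock();
        if (warn) { // log after releasing so logging does not extend the hold time
            System.err.println("Lock held time above threshold: lockHeldTimeMs=" + heldMs + " ms");
        }
    }

    long lastHeldMs() { return lastHeldMs; }

    int warnings() { return warnings; }
}
```

In real use the clock would be System::nanoTime; the injectable clock is only there so the behaviour can be checked without sleeping.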


was (Author: pilchard):
[~brahmareddy] please check this issue as well. In our customer environment, the 
Directory Scanner blocks writes for a long time (300s+). Although the datanode 
has too many blocks, we should not hold the write lock for the scanner.
{code}
2020-06-10 12:17:06,869 | INFO  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | BlockPool BP-1104115233-10.2.245.4-1571300215588 Total blocks: 
11149530, missing metadata files:472, missing block files:472, missing blocks 
in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
2020-06-10 12:17:06,869 | WARN  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | Lock held time above threshold: lock identifier: 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
 | InstrumentedLock.java:143
{code}

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> 

[jira] [Commented] (HDFS-15346) RBF: DistCpFedBalance implementation

2020-06-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130351#comment-17130351
 ] 

Hadoop QA commented on HDFS-15346:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
0s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 15 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  6m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 24s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
26s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  6m 
23s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
38s{color} | {color:blue} branch/hadoop-project no findbugs output file 
(findbugsXml.xml) {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
38s{color} | {color:blue} branch/hadoop-assemblies no findbugs output file 
(findbugsXml.xml) {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} branch/hadoop-tools/hadoop-tools-dist no findbugs 
output file (findbugsXml.xml) {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
33s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 20m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  7m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
45s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
6s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
41s{color} | {color:green} the patch passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
45s{color} | {color:blue} hadoop-project has no data from findbugs {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
44s{color} | {color:blue} hadoop-assemblies has no data from findbugs {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
46s{color} | {color:blue} hadoop-tools/hadoop-tools-dist has no 

[jira] [Comment Edited] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-06-10 Thread ludun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130342#comment-17130342
 ] 

ludun edited comment on HDFS-15160 at 6/10/20, 7:35 AM:


[~brahmareddy] please check this issue as well. In our customer environment, the 
Directory Scanner blocks writes for a long time (300s+). Although the datanode 
has too many blocks, we should not hold the write lock for the scanner.
{code}
2020-06-10 12:17:06,869 | INFO  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | BlockPool BP-1104115233-10.2.245.4-1571300215588 Total blocks: 
11149530, missing metadata files:472, missing block files:472, missing blocks 
in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
2020-06-10 12:17:06,869 | WARN  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | Lock held time above threshold: lock identifier: 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
 | InstrumentedLock.java:143
{code}


was (Author: pilchard):
[~brahmareddy] please check this issue as well. In our customer environment, the 
Directory Scanner blocks writes for a long time.
{code}
2020-06-10 12:17:06,869 | INFO  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | BlockPool BP-1104115233-10.2.245.4-1571300215588 Total blocks: 
11149530, missing metadata files:472, missing block files:472, missing blocks 
in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
2020-06-10 12:17:06,869 | WARN  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | Lock held time above threshold: lock identifier: 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
 | InstrumentedLock.java:143
{code}

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> 

[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-06-10 Thread ludun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130342#comment-17130342
 ] 

ludun commented on HDFS-15160:
--

[~brahmareddy] please check this issue as well. In our customer environment, the 
Directory Scanner blocks writes for a long time.
{code}
2020-06-10 12:17:06,869 | INFO  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | BlockPool BP-1104115233-10.2.245.4-1571300215588 Total blocks: 
11149530, missing metadata files:472, missing block files:472, missing blocks 
in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
2020-06-10 12:17:06,869 | WARN  | 
java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
queue] | Lock held time above threshold: lock identifier: 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
 | InstrumentedLock.java:143
{code}

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> HDFS-15160.006.patch, image-2020-04-10-17-18-08-128.png, 
> image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use the 
> read lock rather than the write lock to improve concurrency. The first step 
> is to make the changes to ReplicaMap, as many other methods make calls to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.






[jira] [Issue Comment Deleted] (HDFS-15398) EC: hdfs client hangs due to exception during addBlock

2020-06-10 Thread Hongbing Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongbing Wang updated HDFS-15398:
-
Comment: was deleted

(was: These failures seem to be related to 
[HDFS-15396|https://issues.apache.org/jira/browse/HDFS-15396], but no tests 
were provided?)

> EC: hdfs client hangs due to exception during addBlock
> --
>
> Key: HDFS-15398
> URL: https://issues.apache.org/jira/browse/HDFS-15398
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec, hdfs-client
>Affects Versions: 3.2.0
>Reporter: Hongbing Wang
>Assignee: Hongbing Wang
>Priority: Critical
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15398.001.patch, HDFS-15398.002.patch, 
> HDFS-15398.003.patch, HDFS-15398.004.patch
>
>
>  When writing EC files, if the client calls addBlock() to apply for the 
> second or a later block group and the request exceeds the quota, the client 
> program hangs forever. 
>  See the demo below:
> {code:java}
> $ hadoop fs -mkdir -p /user/wanghongbing/quota/ec
> $ hdfs dfsadmin -setSpaceQuota 2g /user/wanghongbing/quota
> $ hdfs ec -setPolicy -path /user/wanghongbing/quota/ec -policy RS-6-3-1024k
> Set RS-6-3-1024k erasure coding policy on /user/wanghongbing/quota/ec
> $ hadoop fs -put 800m /user/wanghongbing/quota/ec
> ^@^@^@^@^@^@^@^@^Z
> {code}
> With blocksize=128M, spaceQuota=2g and the EC 6-3 policy, a block group 
> needs to apply for 1152M of physical space to write 768M of logical data. 
> Therefore, writing 800M of data exceeds the quota when applying for the second 
> block group. At this point, the client hangs forever.
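
The numbers in the paragraph above can be checked with a short worked sketch. The class EcQuotaSketch and its method names are hypothetical and exist only for this arithmetic; they are not part of HDFS. The sketch assumes, as the description states, that allocating a block group reserves space for all data and parity blocks at the full block size.

```java
// Worked arithmetic for the quota example; all names here are illustrative.
class EcQuotaSketch {
    static final long MB = 1024L * 1024L;

    // Physical space reserved when one block group is allocated:
    // (data units + parity units) blocks of blockSize each.
    static long physicalPerGroup(int dataUnits, int parityUnits, long blockSize) {
        return (dataUnits + parityUnits) * blockSize;
    }

    // Logical (user) data that one block group can hold.
    static long logicalPerGroup(int dataUnits, long blockSize) {
        return dataUnits * blockSize;
    }

    // Number of block groups needed to store fileSize bytes (rounded up).
    static long groupsNeeded(long fileSize, long logicalPerGroup) {
        return (fileSize + logicalPerGroup - 1) / logicalPerGroup;
    }

    // 1-based index of the first block group whose allocation pushes the
    // cumulative reserved space above the quota.
    static long firstGroupOverQuota(long quota, long physicalPerGroup) {
        return quota / physicalPerGroup + 1;
    }
}
```

With RS-6-3 and 128M blocks, one group reserves 9 x 128M = 1152M for 6 x 128M = 768M of data. An 800M file needs two groups, and the second allocation would bring the reserved total to 2304M, which is above the 2048M quota.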
> The stack trace of the client is as follows:
> {code:java}
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x8009d5d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream$MultipleBlockingQueue.takeWithTimeout(DFSStripedOutputStream.java:117)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.waitEndBlocks(DFSStripedOutputStream.java:453)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.allocateNewBlock(DFSStripedOutputStream.java:477)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.writeChunk(DFSStripedOutputStream.java:541)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
> at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:164)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:145)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1182)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:847)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
> at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
> at org.apache.hadoop.io.IOUtils.cleanupWithLogger(IOUtils.java:280)
> at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:298)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:77)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:129)
> at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:485)
> at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:407)
> at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:342)
> at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:277)
> at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:262)
> {code}
> When an exception occurs in addBlock, 

[jira] [Commented] (HDFS-15398) EC: hdfs client hangs due to exception during addBlock

2020-06-10 Thread Hongbing Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130301#comment-17130301
 ] 

Hongbing Wang commented on HDFS-15398:
--

These failures seem to be related to 
[HDFS-15396|https://issues.apache.org/jira/browse/HDFS-15396], but no tests 
were provided?

> EC: hdfs client hangs due to exception during addBlock
> --
>
> Key: HDFS-15398
> URL: https://issues.apache.org/jira/browse/HDFS-15398
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec, hdfs-client
>Affects Versions: 3.2.0
>Reporter: Hongbing Wang
>Assignee: Hongbing Wang
>Priority: Critical
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15398.001.patch, HDFS-15398.002.patch, 
> HDFS-15398.003.patch, HDFS-15398.004.patch
>
>
>  When writing EC files, if the client calls addBlock() to apply for the 
> second or a later block group and the request exceeds the quota, the client 
> program hangs forever. 
>  See the demo below:
> {code:java}
> $ hadoop fs -mkdir -p /user/wanghongbing/quota/ec
> $ hdfs dfsadmin -setSpaceQuota 2g /user/wanghongbing/quota
> $ hdfs ec -setPolicy -path /user/wanghongbing/quota/ec -policy RS-6-3-1024k
> Set RS-6-3-1024k erasure coding policy on /user/wanghongbing/quota/ec
> $ hadoop fs -put 800m /user/wanghongbing/quota/ec
> ^@^@^@^@^@^@^@^@^Z
> {code}
> With blocksize=128M, spaceQuota=2g and the EC 6-3 policy, a block group 
> needs to reserve 1152M of physical space (9 x 128M) to write 768M of logical 
> data (6 x 128M). Writing 800M of data therefore exceeds the quota when the 
> second block group is requested (2 x 1152M = 2304M > 2048M). At this point, 
> the client hangs forever.
> The client's thread stack is as follows:
> {code:java}
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x8009d5d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream$MultipleBlockingQueue.takeWithTimeout(DFSStripedOutputStream.java:117)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.waitEndBlocks(DFSStripedOutputStream.java:453)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.allocateNewBlock(DFSStripedOutputStream.java:477)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.writeChunk(DFSStripedOutputStream.java:541)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
> at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:164)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:145)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1182)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:847)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
> at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
> at org.apache.hadoop.io.IOUtils.cleanupWithLogger(IOUtils.java:280)
> at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:298)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:77)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:129)
> at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:485)
> at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:407)
> at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:342)
> at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:277)
> at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:262)
> {code}
> When an exception occurs in 
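
The quota arithmetic in the quoted description can be checked with a short sketch. The class and method names below are hypothetical and not part of Hadoop; the sketch assumes, as the description states, that a full block group's raw space (data plus parity blocks) is counted against the quota when the group is requested.

```java
/** Hypothetical helper (not a Hadoop class) that replays the quota arithmetic from the description. */
final class EcQuotaCheck {

    /** Raw space one block group reserves: (data + parity) blocks of blockSizeMb each. */
    static long groupPhysicalMb(long blockSizeMb, int dataUnits, int parityUnits) {
        return blockSizeMb * (dataUnits + parityUnits);
    }

    /** Logical data one full block group can hold: data blocks only. */
    static long groupLogicalMb(long blockSizeMb, int dataUnits) {
        return blockSizeMb * dataUnits;
    }

    /** Number of block groups needed for fileMb of logical data (rounded up). */
    static long groupsNeeded(long fileMb, long blockSizeMb, int dataUnits) {
        long logical = groupLogicalMb(blockSizeMb, dataUnits);
        return (fileMb + logical - 1) / logical;
    }

    /**
     * Returns the 1-based index of the first block group whose reservation
     * pushes cumulative raw usage over quotaMb, or 0 if the file fits.
     */
    static long firstGroupOverQuota(long fileMb, long quotaMb, long blockSizeMb,
                                    int dataUnits, int parityUnits) {
        long physical = groupPhysicalMb(blockSizeMb, dataUnits, parityUnits);
        long groups = groupsNeeded(fileMb, blockSizeMb, dataUnits);
        for (long g = 1; g <= groups; g++) {
            if (g * physical > quotaMb) {
                return g;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Values from the description: 128M blocks, RS-6-3, 2g quota, 800M file.
        System.out.println(groupPhysicalMb(128, 6, 3));                 // 1152
        System.out.println(groupLogicalMb(128, 6));                     // 768
        System.out.println(firstGroupOverQuota(800, 2048, 128, 6, 3));  // 2
    }
}
```

With the values from the demo, the first block group fits (1152M of 2048M) and the second does not (2304M), which matches the point at which the reporter observed the hang.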

[jira] [Commented] (HDFS-15398) EC: hdfs client hangs due to exception during addBlock

2020-06-10 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130294#comment-17130294
 ] 

Hudson commented on HDFS-15398:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18344 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18344/])
HDFS-15398. EC: hdfs client hangs due to exception during addBlock. 
(ayushsaxena: rev b735a777178a3be7924b0ea7c0f61003dc60f16e)
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSStripedOutputStream.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamUpdatePipeline.java



[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS

2020-06-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130288#comment-17130288
 ] 

Hadoop QA commented on HDFS-15098:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 24s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | prototool | 0m 0s | prototool was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 23s | Maven dependency ordering for branch |
| +1 | mvninstall | 21m 16s | trunk passed |
| +1 | compile | 17m 59s | trunk passed |
| +1 | checkstyle | 2m 56s | trunk passed |
| +1 | mvnsite | 2m 24s | trunk passed |
| +1 | shadedclient | 21m 20s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 32s | trunk passed |
| 0 | spotbugs | 2m 32s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 4m 39s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 22s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 39s | the patch passed |
| +1 | compile | 17m 17s | the patch passed |
| -1 | cc | 17m 17s | root generated 29 new + 133 unchanged - 29 fixed = 162 total (was 162) |
| +1 | golang | 17m 17s | the patch passed |
| -1 | javac | 17m 17s | root generated 4 new + 1865 unchanged - 0 fixed = 1869 total (was 1865) |
| -0 | checkstyle | 2m 50s | root: The patch generated 4 new + 211 unchanged - 5 fixed = 215 total (was 216) |
| +1 | mvnsite | 2m 25s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 15m 30s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 35s | the patch passed |
| +1 | findbugs | 5m 4s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 9m 29s | hadoop-common in the patch passed. |
| +1 | unit | 2m 8s | hadoop-hdfs-client in the patch passed. |
| +1 | asflicense | 0m 47s | The patch does not 

[jira] [Updated] (HDFS-15398) EC: hdfs client hangs due to exception during addBlock

2020-06-10 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15398:

Fix Version/s: 3.4.0
   3.3.1
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)


[jira] [Commented] (HDFS-15398) EC: hdfs client hangs due to exception during addBlock

2020-06-10 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130274#comment-17130274
 ] 

Ayush Saxena commented on HDFS-15398:
-

Committed to trunk and branch-3.3.
Thanks [~wanghongbing] for the contribution!


[jira] [Commented] (HDFS-15376) Update the error about command line POST in httpfs documentation

2020-06-10 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130167#comment-17130167
 ] 

Hudson commented on HDFS-15376:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18343 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18343/])
HDFS-15376. Update the error about command line POST in httpfs (ayushsaxena: 
rev 635e6a16d0f407eeec470f2d4d3303092961a177)
* (edit) hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/index.md


> Update the error about command line POST in httpfs documentation
> 
>
> Key: HDFS-15376
> URL: https://issues.apache.org/jira/browse/HDFS-15376
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 3.2.1
>Reporter: bianqi
>Assignee: bianqi
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15376.001.patch
>
>
> The official Hadoop documentation gives the following command, which fails 
> with an exception when executed:
> {quote} {{curl -X POST 'http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=MKDIRS=foo'}} 
> creates the HDFS {{/user/foo/bar}} directory.
> {quote}
> The command returns:
> {quote} *{"RemoteException":{"message":"Invalid HTTP POST operation [MKDIRS]","exception":"IOException","javaClassName":"java.io.IOException"}}*
> {quote}
> I checked the source code and found that the MKDIRS operation must be 
> submitted with PUT rather than POST.
> Running the command with PUT gives the following result:
> {quote} {{curl -X PUT 'http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=MKDIRS=foo'}} 
> creates the HDFS {{/user/foo/bar}} directory.
> {quote}
> The command returns:
> {"boolean":true}
> and the directory is created successfully.
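
The point of the report, that MKDIRS has to be sent with PUT, can be illustrated with a small Java sketch. The class `HttpFsMethodSketch` and its methods are hypothetical; the verb table covers only three WebHDFS operations (MKDIRS, APPEND, OPEN) and the request is prepared but never sent. Authentication parameters are deliberately left out of the URL.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical sketch: pick the HTTP verb for a WebHDFS operation and prepare the request. */
final class HttpFsMethodSketch {

    /** Returns the HTTP verb for the few operations this sketch knows about. */
    static String methodFor(String op) {
        switch (op) {
            case "MKDIRS": return "PUT";   // POST is rejected, as the report shows
            case "APPEND": return "POST";
            case "OPEN":   return "GET";
            default:
                throw new IllegalArgumentException("operation not covered by this sketch: " + op);
        }
    }

    /** Builds (but does not open) a connection for the given operation on the given path. */
    static HttpURLConnection prepare(String baseUrl, String path, String op) {
        try {
            URI uri = URI.create(baseUrl + "/webhdfs/v1" + path + "?op=" + op);
            HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
            conn.setRequestMethod(methodFor(op));
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Host and path are the ones used in the report; no network traffic happens here.
        HttpURLConnection conn = prepare("http://httpfs-host:14000", "/user/foo/bar", "MKDIRS");
        System.out.println(conn.getRequestMethod() + " " + conn.getURL());
    }
}
```

Calling `conn.getResponseCode()` on the prepared connection would actually send the request; the sketch stops short of that so it can run without an HttpFS server.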






[jira] [Updated] (HDFS-15398) EC: hdfs client hangs due to exception during addBlock

2020-06-10 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15398:

Summary: EC: hdfs client hangs due to exception during addBlock  (was: EC: 
hdfs client hangs when writing EC file occurs an addBlock exception)


[jira] [Assigned] (HDFS-15398) EC: hdfs client hangs when writing EC file occurs an addBlock exception

2020-06-10 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena reassigned HDFS-15398:
---

Assignee: Hongbing Wang

> EC: hdfs client hangs when writing EC file occurs an addBlock exception
> ---
>
> Key: HDFS-15398
> URL: https://issues.apache.org/jira/browse/HDFS-15398
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec, hdfs-client
>Affects Versions: 3.2.0
>Reporter: Hongbing Wang
>Assignee: Hongbing Wang
>Priority: Critical
> Attachments: HDFS-15398.001.patch, HDFS-15398.002.patch, 
> HDFS-15398.003.patch, HDFS-15398.004.patch
>
>
> When writing an EC file, if the client calls addBlock() to request the 
> second (or any later) block group and that request exceeds the space quota, 
> the client program hangs forever. 
> See the demo below:
> {code:java}
> $ hadoop fs -mkdir -p /user/wanghongbing/quota/ec
> $ hdfs dfsadmin -setSpaceQuota 2g /user/wanghongbing/quota
> $ hdfs ec -setPolicy -path /user/wanghongbing/quota/ec -policy RS-6-3-1024k
> Set RS-6-3-1024k erasure coding policy on /user/wanghongbing/quota/ec
> $ hadoop fs -put 800m /user/wanghongbing/quota/ec
> ^@^@^@^@^@^@^@^@^Z
> {code}
> With blocksize=128M, spaceQuota=2g and the EC 6-3 policy, a block group 
> needs to reserve 1152M of physical space (9 x 128M) to write 768M of logical 
> data (6 x 128M). Writing 800M of data therefore exceeds the quota when the 
> second block group is requested (2 x 1152M = 2304M > 2048M). At this point, 
> the client hangs forever.
> The client's thread stack is as follows:
> {code:java}
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x8009d5d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream$MultipleBlockingQueue.takeWithTimeout(DFSStripedOutputStream.java:117)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.waitEndBlocks(DFSStripedOutputStream.java:453)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.allocateNewBlock(DFSStripedOutputStream.java:477)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.writeChunk(DFSStripedOutputStream.java:541)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
> at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:164)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:145)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1182)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:847)
> - locked <0x8009f758> (a org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
> at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
> at org.apache.hadoop.io.IOUtils.cleanupWithLogger(IOUtils.java:280)
> at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:298)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:77)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:129)
> at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:485)
> at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:407)
> at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:342)
> at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:277)
> at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:262)
> {code}
> When an exception occurs in addBlock, the program will call 
> DFSStripedOutputStream.closeImpl() -> flushBuffer() -> writeChunk() -> 
> allocateNewBlock() -> 

[jira] [Commented] (HDFS-15398) EC: hdfs client hangs when writing EC file occurs an addBlock exception

2020-06-10 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130162#comment-17130162
 ] 

Ayush Saxena commented on HDFS-15398:
-

Test failures seem unrelated.
v004 LGTM +1

> EC: hdfs client hangs when writing EC file occurs an addBlock exception
> ---
>
> Key: HDFS-15398
> URL: https://issues.apache.org/jira/browse/HDFS-15398
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec, hdfs-client
>Affects Versions: 3.2.0
>Reporter: Hongbing Wang
>Priority: Critical
> Attachments: HDFS-15398.001.patch, HDFS-15398.002.patch, 
> HDFS-15398.003.patch, HDFS-15398.004.patch
>
>
> When writing an EC file, if the client calls addBlock() to request the 
> second or a later block group and that request exceeds the quota, the 
> client program hangs forever. 
> See the demo below:
> {code:java}
> $ hadoop fs -mkdir -p /user/wanghongbing/quota/ec
> $ hdfs dfsadmin -setSpaceQuota 2g /user/wanghongbing/quota
> $ hdfs ec -setPolicy -path /user/wanghongbing/quota/ec -policy RS-6-3-1024k
> Set RS-6-3-1024k erasure coding policy on /user/wanghongbing/quota/ec
> $ hadoop fs -put 800m /user/wanghongbing/quota/ec
> ^@^@^@^@^@^@^@^@^Z
> {code}
> With blocksize=128M, spaceQuota=2g and the RS-6-3 EC policy, each block group 
> must reserve 1152M of physical space to hold 768M of logical data. 
> Writing 800M of data therefore exceeds the quota when the client requests the 
> second block group. At this point, the client hangs forever.
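
The arithmetic in the paragraph above can be checked with a short sketch. It assumes that each addBlock() call is charged a full block group (data plus parity blocks) against the space quota at allocation time, which is what the description implies; the function names are hypothetical and this is not the NameNode's actual quota code.

```python
MIB = 1024 * 1024


def block_groups_needed(logical_bytes, block_size, data_units):
    """Number of block groups needed to hold logical_bytes of file data."""
    group_logical = block_size * data_units
    return -(-logical_bytes // group_logical)  # integer ceiling division


def first_group_over_quota(logical_bytes, quota_bytes, block_size,
                           data_units, parity_units):
    """Return the 1-based index of the block group whose allocation would
    push reserved space past the quota, or None if every group fits.

    Assumption: every group reserves block_size * (data + parity) bytes
    up front, regardless of how much of it is eventually written.
    """
    group_physical = block_size * (data_units + parity_units)
    groups = block_groups_needed(logical_bytes, block_size, data_units)
    for index in range(1, groups + 1):
        if index * group_physical > quota_bytes:
            return index
    return None


# Values from the demo: 800M file, 2g quota, 128M blocks, RS-6-3.
print(first_group_over_quota(800 * MIB, 2048 * MIB, 128 * MIB, 6, 3))
```

Under these assumptions the first group reserves 1152M, which fits in 2048M, while the second would bring the total to 2304M, so the second request is the one that exceeds the quota.
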
> The client's stack trace is as follows:
> {code:java}
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x8009d5d8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at 
> java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
> at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream$MultipleBlockingQueue.takeWithTimeout(DFSStripedOutputStream.java:117)
> at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.waitEndBlocks(DFSStripedOutputStream.java:453)
> at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.allocateNewBlock(DFSStripedOutputStream.java:477)
> at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.writeChunk(DFSStripedOutputStream.java:541)
> - locked <0x8009f758> (a 
> org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at 
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
> at 
> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:164)
> - locked <0x8009f758> (a 
> org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at 
> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:145)
> - locked <0x8009f758> (a 
> org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream.closeImpl(DFSStripedOutputStream.java:1182)
> - locked <0x8009f758> (a 
> org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:847)
> - locked <0x8009f758> (a 
> org.apache.hadoop.hdfs.DFSStripedOutputStream)
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
> at org.apache.hadoop.io.IOUtils.cleanupWithLogger(IOUtils.java:280)
> at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:298)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:77)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:129)
> at 
> org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:485)
> at 
> org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:407)
> at 
> org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:342)
> at 
> org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:277)
> at 
> org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:262)
> {code}
> When an exception occurs in addBlock, the program will call 
> DFSStripedOutputStream.closeImpl() -> flushBuffer() -> writeChunk() -> 
> allocateNewBlock() 

[jira] [Commented] (HDFS-15376) Update the error about command line POST in httpfs documentation

2020-06-10 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130125#comment-17130125
 ] 

Ayush Saxena commented on HDFS-15376:
-

Committed to trunk.
Thanks [~bianqi] for the contribution and [~elgoiri] for the review!

> Update the error about command line POST in httpfs documentation
> 
>
> Key: HDFS-15376
> URL: https://issues.apache.org/jira/browse/HDFS-15376
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 3.2.1
>Reporter: bianqi
>Assignee: bianqi
>Priority: Major
> Attachments: HDFS-15376.001.patch
>
>
> The official Hadoop documentation gives the following command, which fails 
> with an exception when executed.
> {quote}{{curl -X POST 
> 'http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=MKDIRS&user.name=foo'}} 
> creates the HDFS {{/user/foo/bar}} directory.
> {quote}
> The command returns:
> {quote}*{"RemoteException":{"message":"Invalid HTTP POST operation 
> [MKDIRS]","exception":"IOException","javaClassName":"java.io.IOException"}}*
> {quote}
> I checked the source code and found that creating a directory requires 
> the PUT method.
> Running the command with PUT instead gives the following result:
> {quote}{{curl -X PUT 
> 'http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=MKDIRS&user.name=foo'}} 
> creates the HDFS {{/user/foo/bar}} directory.
> {quote}
> The command returns:
> {"boolean":true}
> and the directory is created successfully.
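
For readers scripting this instead of using curl, the corrected call can be sketched as below. It only builds the request and does not send it. The helper name is hypothetical, the host and path are the placeholders from the issue, and the `user.name` query parameter is assumed from the usual WebHDFS pseudo-authentication convention.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_mkdirs_request(base_url, hdfs_path, user):
    """Build (but do not send) an HttpFS MKDIRS request.

    MKDIRS has to be sent with PUT. The issue reports that the POST
    variant is rejected with "Invalid HTTP POST operation [MKDIRS]".
    """
    query = urlencode({"op": "MKDIRS", "user.name": user})
    url = f"{base_url}/webhdfs/v1{quote(hdfs_path)}?{query}"
    return Request(url, method="PUT")


req = build_mkdirs_request("http://httpfs-host:14000", "/user/foo/bar", "foo")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` against a reachable HttpFS server would perform the call; according to the issue, a successful response body is `{"boolean":true}`.
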






[jira] [Updated] (HDFS-15376) Update the error about command line POST in httpfs documentation

2020-06-10 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15376:

Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Update the error about command line POST in httpfs documentation
> 
>
> Key: HDFS-15376
> URL: https://issues.apache.org/jira/browse/HDFS-15376
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 3.2.1
>Reporter: bianqi
>Assignee: bianqi
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15376.001.patch
>
>
> The official Hadoop documentation gives the following command, which fails 
> with an exception when executed.
> {quote}{{curl -X POST 
> 'http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=MKDIRS&user.name=foo'}} 
> creates the HDFS {{/user/foo/bar}} directory.
> {quote}
> The command returns:
> {quote}*{"RemoteException":{"message":"Invalid HTTP POST operation 
> [MKDIRS]","exception":"IOException","javaClassName":"java.io.IOException"}}*
> {quote}
> I checked the source code and found that creating a directory requires 
> the PUT method.
> Running the command with PUT instead gives the following result:
> {quote}{{curl -X PUT 
> 'http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=MKDIRS&user.name=foo'}} 
> creates the HDFS {{/user/foo/bar}} directory.
> {quote}
> The command returns:
> {"boolean":true}
> and the directory is created successfully.






[jira] [Created] (HDFS-15405) DataXceiver error processing READ_BLOCK operation src: /10.10.10.87:37424 dst: /10.10.10.87:50010

2020-06-10 Thread wenbin lee (Jira)
wenbin lee created HDFS-15405:
-

 Summary: DataXceiver error processing READ_BLOCK operation  src: 
/10.10.10.87:37424 dst: /10.10.10.87:50010
 Key: HDFS-15405
 URL: https://issues.apache.org/jira/browse/HDFS-15405
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.5.0
Reporter: wenbin lee


After the datanode service restarts, the datanode log file records many errors like the following:

 

ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
S10-870.server.baihe:50010:DataXceiver error processing READ_BLOCK operation 
src: /10.10.10.87:37424 dst: /10.10.10.87:50010
java.io.IOException: Replica gen stamp < block genstamp, 
block=BP-1354516653-10.10.10.33-1532503068514:blk_1080284948_6544482, 
replica=ReplicaWaitingToBeRecovered, blk_1080284948_6544202, RWR
 getNumBytes() = 6127077
 getBytesOnDisk() = 6127077
 getVisibleLength()= -1
 getVolume() = /home/disk3/dfs/dn/current
 getBlockFile() = 
/home/disk3/dfs/dn/current/BP-1354516653-10.10.10.33-1532503068514/current/rbw/blk_1080284948
 unlinked=false
 at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:240)
 at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:495)
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
 at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:234)
 at java.lang.Thread.run(Thread.java:745)
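
The exception message above can be read as a comparison between the generation stamps carried in the two block names: the reader asked for blk_1080284948_6544482 while the replica on disk is blk_1080284948_6544202. A minimal sketch of that comparison follows. It assumes the name layout `blk_<blockId>_<generationStamp>` as it appears in the log; the helper names are hypothetical and this is not the DataNode's actual check.

```python
import re

# Assumed layout, taken from the log lines: blk_<blockId>_<generationStamp>
BLOCK_NAME = re.compile(r"blk_(\d+)_(\d+)")


def parse_block(name):
    """Split a block name into (block_id, generation_stamp) integers."""
    match = BLOCK_NAME.search(name)
    if match is None:
        raise ValueError(f"not a block name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def replica_is_stale(requested, replica):
    """True when both names refer to the same block ID but the replica's
    generation stamp is older than the one the reader asked for."""
    req_id, req_gs = parse_block(requested)
    rep_id, rep_gs = parse_block(replica)
    return req_id == rep_id and rep_gs < req_gs


requested = "BP-1354516653-10.10.10.33-1532503068514:blk_1080284948_6544482"
replica = "blk_1080284948_6544202"
print(replica_is_stale(requested, replica))
```

For the values in the log the replica's stamp (6544202) is lower than the requested one (6544482), which matches the "Replica gen stamp < block genstamp" message.
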


