[jira] [Commented] (HDFS-12946) Add a tool to check rack configuration against EC policies

2018-10-16 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652115#comment-16652115
 ] 

Andrew Wang commented on HDFS-12946:


Hi folks, thanks for working on this! Catching up on the discussion, this is a 
nice change and is something I've hit before too (though hopefully not 
something we see too often in production).

What I'd ask (as with most "monitoring"-type applications) is about the use 
case. Cluster admins want to automate their alerting and reporting. If 
they've gotten to the point that they need to take some manual action (e.g. use 
fsck, {{hdfs debug}}, or call this new RPC), it's because something external has 
told them there is an issue. I go to interactive debugging tools for the 
next level of detail on alerts that can't be easily automated.

In this case, it seems like most users would want to automate an alert based on 
the metric. It's similar to mis-replication. The RPC isn't as useful IMO since 
it doesn't tell you anything extra, though I would suggest logging a WARN/ERROR 
when enabling an EC policy and this condition is true.

Are there any existing ways of querying the cluster topology and enabled EC 
policies, and then computing this client-side? If not, I think this would be a 
more generally useful admin interface than the very-lightweight new RPC.

One code comment: I would prefer booleans for the MXBean rather than the 
integer, for additional clarity, since a bare int return type is a bit opaque. 
In code I'd recommend using an enum or named static constants, but that 
doesn't work for the MXBean.
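
For illustration, here is a minimal sketch of what I mean. The names are 
hypothetical, not from the actual patch:

{code:java}
// Hypothetical MXBean sketch -- names are illustrative only. Booleans are
// self-describing over JMX, where an enum or a bare int would be opaque.
public interface ECTopologyMXBean {
  // True if the current rack/datanode topology can satisfy every enabled
  // EC policy.
  boolean isTopologyVerifiedForEnabledPolicies();

  // Operator-readable detail for the failure case.
  String getTopologyVerificationMessage();
}
{code}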

> Add a tool to check rack configuration against EC policies
> --
>
> Key: HDFS-12946
> URL: https://issues.apache.org/jira/browse/HDFS-12946
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Reporter: Xiao Chen
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch, 
> HDFS-12946.03.patch, HDFS-12946.04.fsck.patch
>
>
> From testing we have seen setups with problematic racks / datanodes that 
> cannot support basic EC usage. These are usually found out only after the 
> tests fail.
> We should provide a way to check this beforehand.
> Some scenarios:
> - not enough datanodes compared to the EC policy's highest data+parity number
> - not enough racks to satisfy BPPRackFaultTolerant
> - racks too uneven to satisfy BPPRackFaultTolerant
> - highly uneven racks (so that the BPP's considerLoad logic may exclude some 
> busy nodes on a rack, resulting in #2)






[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode

2018-07-23 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553150#comment-16553150
 ] 

Andrew Wang commented on HDFS-13672:


Hi Gabor, I took a quick look. It doesn't look like the iterator will make 
forward progress between iterations since it's starting from the beginning each 
time. What we need here is a tail iterator that starts at the last processed 
element.

I recommend we close this as wontfix and then open a new JIRA to figure out 
whether we want to disable this feature (which is incompatible) or do some 
kind of smarter detection of whether it's necessary. As an example, we check 
whether encryption is being used by seeing if there are any encryption zones 
created.

It's also worth asking whether we should make a behavior change at all, since 
this long blocking scan will probably only happen in debugging situations 
(and we have a workaround).
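
To illustrate the tail-iterator idea as a standalone sketch (plain Java over a 
TreeSet, not the HDFS block map or its actual locking):

{code:java}
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Standalone sketch of resumable, batched iteration: each batch reacquires
// the lock and resumes *after* the last processed element, so the scan makes
// forward progress instead of restarting from the head every time.
public class BatchedScan {
  private static final int BATCH_SIZE = 1000;
  private final TreeSet<Long> blockIds = new TreeSet<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  void scanAll() {
    Long resumeAfter = null;
    while (true) {
      lock.writeLock().lock();
      try {
        Iterable<Long> tail = (resumeAfter == null)
            ? blockIds
            : blockIds.tailSet(resumeAfter, false);  // strictly after
        int processed = 0;
        for (Long id : tail) {
          resumeAfter = id;  // remember our position for the next batch
          processed++;
          if (processed >= BATCH_SIZE) {
            break;
          }
        }
        if (processed < BATCH_SIZE) {
          return;  // exhausted the set
        }
      } finally {
        // Dropping the lock between batches bounds the longest hold time.
        lock.writeLock().unlock();
      }
    }
  }
}
{code}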

> clearCorruptLazyPersistFiles could crash NameNode
> -
>
> Key: HDFS-13672
> URL: https://issues.apache.org/jira/browse/HDFS-13672
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HDFS-13672.001.patch, HDFS-13672.002.patch, 
> HDFS-13672.003.patch
>
>
> I started a NameNode on a pretty large fsimage. Since the NameNode is started 
> without any DataNodes, all blocks (100 million) are "corrupt".
> Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write 
> lock for a long time:
> {noformat}
> 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held 
> for 46024 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543)
> java.lang.Thread.run(Thread.java:748)
> Number of suppressed write-lock reports: 0
> Longest write-lock held interval: 46024
> {noformat}
> Here's the relevant code:
> {code}
>   writeLock();
>   try {
> final Iterator<Block> it =
> blockManager.getCorruptReplicaBlockIterator();
> while (it.hasNext()) {
>   Block b = it.next();
>   BlockInfo blockInfo = blockManager.getStoredBlock(b);
>   if (blockInfo.getBlockCollection().getStoragePolicyID() == 
> lpPolicy.getId()) {
> filesToDelete.add(blockInfo.getBlockCollection());
>   }
> }
> for (BlockCollection bc : filesToDelete) {
>   LOG.warn("Removing lazyPersist file " + bc.getName() + " with no 
> replicas.");
>   changed |= deleteInternal(bc.getName(), false, false, false);
> }
>   } finally {
> writeUnlock();
>   }
> {code}
> In essence, the iteration over the corrupt replica list should be broken 
> down into smaller iterations to avoid a single long wait.
> Since this operation holds NameNode write lock for more than 45 seconds, the 
> default ZKFC connection timeout, it implies an extreme case like this (100 
> million corrupt blocks) could lead to NameNode failover.






[jira] [Commented] (HDFS-13658) fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 1 replica

2018-07-09 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536996#comment-16536996
 ] 

Andrew Wang commented on HDFS-13658:


Hi Kitti, thanks for working on this! IIUC, your patch calls 
updateOneReplicaBlocks in different places in BlockManager to track this 
metric. However, don't we already have this metric in LowRedundancyBlocks, via 
the size of the highest-priority queue? This would be an easy way of also 
handling the EC case, since it uses the highest-priority queue for minimally 
durable blocks. Exposing the lengths of these different queues might be 
interesting more generally, since it would give more detailed insight into NN 
recovery activities. I'll also note that countNodes is a somewhat expensive 
function, so it's not good to be calling it frequently in the BM.
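
As a rough sketch of the queue-size idea (generic Java, not the actual 
LowRedundancyBlocks internals):

{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Generic sketch: bucket blocks by redundancy priority and read each bucket's
// size as the metric, instead of maintaining a separate counter that has to
// be updated all over BlockManager (or calling the expensive countNodes()).
public class PriorityBuckets {
  public static final int QUEUE_HIGHEST_PRIORITY = 0;  // e.g. one replica left
  private final List<Set<Long>> queues = new ArrayList<>();

  public PriorityBuckets(int levels) {
    for (int i = 0; i < levels; i++) {
      queues.add(new HashSet<>());
    }
  }

  public void add(long blockId, int priority) {
    queues.get(priority).add(blockId);
  }

  // The "blocks with one replica" metric falls out of the queue size.
  public int getHighestPriorityBlockCount() {
    return queues.get(QUEUE_HIGHEST_PRIORITY).size();
  }
}
{code}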

A few other comments:

* ClientProtocol#getStats is deprecated, so we shouldn't be putting new fields 
there. I think getReplicatedBlockStats and getECBlockGroupStats are the correct 
replacements. Similarly for the new beans: there are Replicated and EC classes, 
so this shouldn't go into NameNodeMXBean.
* Do we need the fsck changes? fsck already shows the number of 
under-replicated blocks, which is a very similar sign that the cluster is not 
healthy. If an admin isn't seeing the existing fsck metric, they aren't going 
to see this one either. Skipping the fsck changes would save us making the 
protocol changes, if we're just exposing new NN metrics.

> fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 
> 1 replica
> ---
>
> Key: HDFS-13658
> URL: https://issues.apache.org/jira/browse/HDFS-13658
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-13658.001.patch, HDFS-13658.002.patch, 
> HDFS-13658.003.patch, HDFS-13658.004.patch, HDFS-13658.005.patch, 
> HDFS-13658.006.patch, HDFS-13658.007.patch
>
>
> fsck, dfsadmin -report, and NN WebUI should report the number of blocks that 
> have 1 replica. We have had many cases opened in which a customer lost 
> files/blocks after losing a disk or a DataNode, because they had blocks with 
> only 1 replica. We need to make customers better aware of this situation so 
> that they can take action.






[jira] [Updated] (HDFS-13719) Docs around dfs.image.transfer.timeout are misleading

2018-07-09 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13719:
---
   Resolution: Fixed
Fix Version/s: 3.0.4
   3.1.1
   3.2.0
   2.10.0
   Status: Resolved  (was: Patch Available)

Thanks for the contribution Kitti! I've committed this to trunk, branch-3.1, 
branch-3.0, and branch-2.

> Docs around dfs.image.transfer.timeout are misleading
> -
>
> Key: HDFS-13719
> URL: https://issues.apache.org/jira/browse/HDFS-13719
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
>  Labels: hdfs
> Fix For: 2.10.0, 3.2.0, 3.1.1, 3.0.4
>
> Attachments: HDFS-13719-branch-2.001.patch, HDFS-13719.001.patch, 
> HDFS-13719.002.patch
>
>
> The Jira https://issues.apache.org/jira/browse/HDFS-1490 added the parameter 
> dfs.image.transfer.timeout to HDFS. From the patch (and checking the current 
> code), we can see this parameter governs a socket timeout on a 
> java.net.HttpURLConnection object:
> {code:java}
> +if (timeout <= 0) {
> +  // Set the ping interval as timeout
> +  Configuration conf = new HdfsConfiguration();
> +  timeout = conf.getInt(DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_KEY,
> +  DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_DEFAULT);
> +}
> +
> +if (timeout > 0) {
> +  connection.setConnectTimeout(timeout);
> +  connection.setReadTimeout(timeout);
> +}
> +
> {code}
> In the above 'connection' is a java.net.HttpURLConnection.
> There is a general belief in the community that dfs.image.transfer.timeout 
> is the time within which the entire image must transfer; however, that does 
> not appear to be the case. The timeout is actually the maximum time the 
> client will block on the socket before giving up if it cannot get data to 
> read. I guess the idea here is to protect the client from hanging forever if 
> the server hangs.
> The docs in hdfs-site.xml are partly what causes this confusion, as they are 
> very misleading:
> {code:xml}
> <property>
>   <name>dfs.image.transfer.timeout</name>
>   <value>60000</value>
>   <description>
>     Socket timeout for image transfer in milliseconds. This timeout and the related
>     dfs.image.transfer.bandwidthPerSec parameter should be configured such
>     that normal image transfer can complete successfully.
>     This timeout prevents client hangs when the sender fails during
>     image transfer. This is socket timeout during image transfer.
>   </description>
> </property>
> {code}
> The start and end of the statement are accurate, but the part "This timeout 
> and the related dfs.image.transfer.bandwidthPerSec parameter should be 
> configured such that normal image transfer can complete successfully." is 
> misleading. There is almost never a reason to change this setting in 
> conjunction with the bandwidth setting.
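
A minimal standalone example of what the two timeouts actually bound (the URL 
below is just a placeholder):

{code:java}
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Demonstrates that connect/read timeouts bound individual socket waits, not
// the total transfer time: a slow-but-steady transfer can run for hours as
// long as every individual read() returns within the read timeout.
public class TimeoutDemo {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://example.com/imagetransfer");  // placeholder
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setConnectTimeout(60_000);  // max wait to establish the connection
    conn.setReadTimeout(60_000);     // max wait for each read(), not the sum
    try (InputStream in = conn.getInputStream()) {
      byte[] buf = new byte[8192];
      while (in.read(buf) != -1) {
        // consume image bytes; total elapsed time is unbounded
      }
    }
  }
}
{code}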






[jira] [Commented] (HDFS-13719) Docs around dfs.image.transfer.timeout are misleading

2018-07-05 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533545#comment-16533545
 ] 

Andrew Wang commented on HDFS-13719:


Nice find. This looks like a holdover from when we changed checkpointing upload 
from a nested GET to a PUT, so the timeout now works as you say.

Let's delete the sentence "The maximum bandwidth should be set such that normal 
image transfers can complete successfully." since it's also a holdover from 
when it was a nested GET. Now it no longer matters.

Otherwise LGTM, thanks for finding and fixing this Kitti! We can try 
backporting this all the way to branch-2 if we want.

> Docs around dfs.image.transfer.timeout are misleading
> -
>
> Key: HDFS-13719
> URL: https://issues.apache.org/jira/browse/HDFS-13719
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
>  Labels: hdfs
> Attachments: HDFS-13719.001.patch
>
>
> The Jira https://issues.apache.org/jira/browse/HDFS-1490 added the parameter 
> dfs.image.transfer.timeout to HDFS. From the patch (and checking the current 
> code), we can see this parameter governs a socket timeout on a 
> java.net.HttpURLConnection object:
> {code:java}
> +if (timeout <= 0) {
> +  // Set the ping interval as timeout
> +  Configuration conf = new HdfsConfiguration();
> +  timeout = conf.getInt(DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_KEY,
> +  DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_DEFAULT);
> +}
> +
> +if (timeout > 0) {
> +  connection.setConnectTimeout(timeout);
> +  connection.setReadTimeout(timeout);
> +}
> +
> {code}
> In the above 'connection' is a java.net.HttpURLConnection.
> There is a general belief in the community that dfs.image.transfer.timeout 
> is the time within which the entire image must transfer; however, that does 
> not appear to be the case. The timeout is actually the maximum time the 
> client will block on the socket before giving up if it cannot get data to 
> read. I guess the idea here is to protect the client from hanging forever if 
> the server hangs.
> The docs in hdfs-site.xml are partly what causes this confusion, as they are 
> very misleading:
> {code:xml}
> <property>
>   <name>dfs.image.transfer.timeout</name>
>   <value>60000</value>
>   <description>
>     Socket timeout for image transfer in milliseconds. This timeout and the related
>     dfs.image.transfer.bandwidthPerSec parameter should be configured such
>     that normal image transfer can complete successfully.
>     This timeout prevents client hangs when the sender fails during
>     image transfer. This is socket timeout during image transfer.
>   </description>
> </property>
> {code}
> The start and end of the statement are accurate, but the part "This timeout 
> and the related dfs.image.transfer.bandwidthPerSec parameter should be 
> configured such that normal image transfer can complete successfully." is 
> misleading. There is almost never a reason to change this setting in 
> conjunction with the bandwidth setting.






[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode

2018-07-03 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531431#comment-16531431
 ] 

Andrew Wang commented on HDFS-13672:


Hi Gabor, to respond to your question: I think the very common case will be 
zero lazy persist files, the rare case "some" (~thousands), and the very, very 
rare case lots (~millions).

I agree that holding the lock for a long time is an anti-pattern. I actually 
had a patch I was working on a while ago for a different feature that added a 
safe way of iterating over the block map.

However, for this case I don't know if it's worth spending a lot of time 
optimizing, since the number of corrupt blocks in the system is normally not 
that large. It's rare for a NameNode to be transitioned to active while 
missing a lot of DNs like this (which is why we have startup safemode checks). 
This probably only happens during debugging, in which case we could also solve 
the problem by setting the scrubber interval to 0 to disable it.

[~jojochuang] what do you think?

> clearCorruptLazyPersistFiles could crash NameNode
> -
>
> Key: HDFS-13672
> URL: https://issues.apache.org/jira/browse/HDFS-13672
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HDFS-13672.001.patch, HDFS-13672.002.patch
>
>
> I started a NameNode on a pretty large fsimage. Since the NameNode is started 
> without any DataNodes, all blocks (100 million) are "corrupt".
> Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write 
> lock for a long time:
> {noformat}
> 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held 
> for 46024 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543)
> java.lang.Thread.run(Thread.java:748)
> Number of suppressed write-lock reports: 0
> Longest write-lock held interval: 46024
> {noformat}
> Here's the relevant code:
> {code}
>   writeLock();
>   try {
> final Iterator<Block> it =
> blockManager.getCorruptReplicaBlockIterator();
> while (it.hasNext()) {
>   Block b = it.next();
>   BlockInfo blockInfo = blockManager.getStoredBlock(b);
>   if (blockInfo.getBlockCollection().getStoragePolicyID() == 
> lpPolicy.getId()) {
> filesToDelete.add(blockInfo.getBlockCollection());
>   }
> }
> for (BlockCollection bc : filesToDelete) {
>   LOG.warn("Removing lazyPersist file " + bc.getName() + " with no 
> replicas.");
>   changed |= deleteInternal(bc.getName(), false, false, false);
> }
>   } finally {
> writeUnlock();
>   }
> {code}
> In essence, the iteration over the corrupt replica list should be broken 
> down into smaller iterations to avoid a single long wait.
> Since this operation holds NameNode write lock for more than 45 seconds, the 
> default ZKFC connection timeout, it implies an extreme case like this (100 
> million corrupt blocks) could lead to NameNode failover.






[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode

2018-07-03 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531371#comment-16531371
 ] 

Andrew Wang commented on HDFS-13672:


Hi Gabor, thanks for working on this,

I don't think it's thread safe to drop the lock while holding onto an iterator 
like this. This is a LinkedSetIterator and will throw a 
ConcurrentModificationException if the set is changed underneath it. We need a 
way to safely resume at a mid-point, and that seems a bit hard with 
LinkedSetIterator as it is.

Since I think the common case here is that there are zero lazy persist files, a 
better (though different) change would be to skip running this scrubber 
entirely if there aren't any lazy persist files. I'm hoping there's an easy way 
to add a counter for this (or some existing way to query if there are any lazy 
persist files).
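
A minimal sketch of the guard I have in mind; the counter and hooks below are 
an assumption, not existing HDFS code:

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: maintain a count of lazy persist files so the scrubber
// can return immediately in the common zero-file case, without ever taking
// the write lock.
public class LazyPersistScrubberGuard {
  private final AtomicLong lazyPersistFileCount = new AtomicLong();

  void onLazyPersistFileCreated() {
    lazyPersistFileCount.incrementAndGet();
  }

  void onLazyPersistFileDeleted() {
    lazyPersistFileCount.decrementAndGet();
  }

  void runScrubber() {
    if (lazyPersistFileCount.get() == 0) {
      return;  // common case: nothing to scrub, skip the locked scan entirely
    }
    // ... take the write lock and scan the corrupt replica list as before ...
  }
}
{code}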

We also need unit tests for new changes like this. I think you also typo'd the 
config key name with "sec" instead of "millis" or "ms". Config keys also need 
to be added to hdfs-default.xml with a description, for documentation purposes.

> clearCorruptLazyPersistFiles could crash NameNode
> -
>
> Key: HDFS-13672
> URL: https://issues.apache.org/jira/browse/HDFS-13672
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HDFS-13672.001.patch, HDFS-13672.002.patch
>
>
> I started a NameNode on a pretty large fsimage. Since the NameNode is started 
> without any DataNodes, all blocks (100 million) are "corrupt".
> Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write 
> lock for a long time:
> {noformat}
> 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held 
> for 46024 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543)
> java.lang.Thread.run(Thread.java:748)
> Number of suppressed write-lock reports: 0
> Longest write-lock held interval: 46024
> {noformat}
> Here's the relevant code:
> {code}
>   writeLock();
>   try {
> final Iterator<Block> it =
> blockManager.getCorruptReplicaBlockIterator();
> while (it.hasNext()) {
>   Block b = it.next();
>   BlockInfo blockInfo = blockManager.getStoredBlock(b);
>   if (blockInfo.getBlockCollection().getStoragePolicyID() == 
> lpPolicy.getId()) {
> filesToDelete.add(blockInfo.getBlockCollection());
>   }
> }
> for (BlockCollection bc : filesToDelete) {
>   LOG.warn("Removing lazyPersist file " + bc.getName() + " with no 
> replicas.");
>   changed |= deleteInternal(bc.getName(), false, false, false);
> }
>   } finally {
> writeUnlock();
>   }
> {code}
> In essence, the iteration over the corrupt replica list should be broken 
> down into smaller iterations to avoid a single long wait.
> Since this operation holds NameNode write lock for more than 45 seconds, the 
> default ZKFC connection timeout, it implies an extreme case like this (100 
> million corrupt blocks) could lead to NameNode failover.






[jira] [Updated] (HDFS-13712) BlockReaderRemote.read() logging improvement

2018-07-03 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13712:
---
   Resolution: Fixed
Fix Version/s: 3.0.4
   3.1.1
   3.2.0
   Status: Resolved  (was: Patch Available)

Thanks for filing and fixing this Gergo, committed to trunk, branch-3.1, 
branch-3.0.

> BlockReaderRemote.read() logging improvement
> 
>
> Key: HDFS-13712
> URL: https://issues.apache.org/jira/browse/HDFS-13712
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.0
>Reporter: Gergo Repas
>Assignee: Gergo Repas
>Priority: Minor
> Fix For: 3.2.0, 3.1.1, 3.0.4
>
> Attachments: HDFS-13712.000.patch
>
>
> Logger.isTraceEnabled() shows up as a hot method via calls from 
> BlockReaderRemote.read(). The attached patch reduces the number of such calls 
> when trace-logging is turned off, and is on par when trace-logging is 
> turned on.
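
As a general illustration of the pattern (simplified stand-in code, not the 
actual patch):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch: evaluate isTraceEnabled() once per read() and reuse the result at
// every trace point, instead of re-evaluating it on each one.
public class ReadLoop {
  private static final Logger LOG = LoggerFactory.getLogger(ReadLoop.class);

  int read(byte[] buf, int len) {
    final boolean trace = LOG.isTraceEnabled();  // one check per read()
    if (trace) {
      LOG.trace("starting read of up to {} bytes", len);
    }
    int n = fill(buf, len);  // stand-in for the real network read
    if (trace) {
      LOG.trace("read {} bytes", n);
    }
    return n;
  }

  private int fill(byte[] buf, int len) {
    return Math.min(buf.length, len);  // dummy implementation
  }
}
{code}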






[jira] [Commented] (HDFS-13712) BlockReaderRemote.read() logging improvement

2018-07-02 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529970#comment-16529970
 ] 

Andrew Wang commented on HDFS-13712:


+1 LGTM pending Jenkins, thanks for filing this and posting a patch Gergo!

> BlockReaderRemote.read() logging improvement
> 
>
> Key: HDFS-13712
> URL: https://issues.apache.org/jira/browse/HDFS-13712
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.0
>Reporter: Gergo Repas
>Assignee: Gergo Repas
>Priority: Minor
> Attachments: HDFS-13712.000.patch
>
>
> Logger.isTraceEnabled() shows up as a hot method via calls from 
> BlockReaderRemote.read(). The attached patch reduces the number of such calls 
> when trace-logging is turned off, and is on par when trace-logging is 
> turned on.






[jira] [Updated] (HDFS-13712) BlockReaderRemote.read() logging improvement

2018-07-02 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13712:
---
Target Version/s: 3.2.0, 3.1.1, 3.0.4  (was: 3.2.0)

> BlockReaderRemote.read() logging improvement
> 
>
> Key: HDFS-13712
> URL: https://issues.apache.org/jira/browse/HDFS-13712
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.0
>Reporter: Gergo Repas
>Assignee: Gergo Repas
>Priority: Minor
> Attachments: HDFS-13712.000.patch
>
>
> Logger.isTraceEnabled() shows up as a hot method via calls from 
> BlockReaderRemote.read(). The attached patch reduces the number of such calls 
> when trace-logging is turned off, and is on par when trace-logging is 
> turned on.






[jira] [Updated] (HDFS-13702) Remove HTrace hooks from DFSClient to reduce CPU usage

2018-07-02 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13702:
---
   Resolution: Fixed
Fix Version/s: 3.0.4
   3.1.1
   3.2.0
   Status: Resolved  (was: Patch Available)

+1 LGTM as well, committing on behalf of Todd to trunk, branch-3.1, branch-3.0. 
Thanks for the patch Todd, and reviews from Stack and Steve. Let's take the 
continued tracing discussion to common-dev.

> Remove HTrace hooks from DFSClient to reduce CPU usage
> --
>
> Key: HDFS-13702
> URL: https://issues.apache.org/jira/browse/HDFS-13702
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: performance
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.4
>
> Attachments: hdfs-13702.patch, hdfs-13702.patch, hdfs-13702.patch
>
>
> I am seeing DFSClient.newReaderTraceScope take ~15% CPU in a TeraValidate 
> workload even when HTrace is disabled. This is because it stringifies several 
> integers. We should avoid all allocation and stringification when HTrace is 
> disabled.
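
An illustrative sketch of the problem and the fix; the tracer interface below 
is a simplified stand-in, not the HTrace API:

{code:java}
// Build the trace description only when tracing is actually enabled, so the
// disabled path performs no string concatenation, boxing, or allocation.
public class TraceGuard {
  interface Tracer {
    boolean isEnabled();
    void begin(String description);
  }

  static void onRead(Tracer tracer, long blockId, long offset, int len) {
    if (!tracer.isEnabled()) {
      return;  // fast path: zero allocation when tracing is off
    }
    // Slow path only: stringifying several integers costs CPU and garbage.
    tracer.begin("read block=" + blockId + " offset=" + offset + " len=" + len);
  }
}
{code}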






[jira] [Updated] (HDFS-13702) Remove HTrace hooks from DFSClient to reduce CPU usage

2018-07-02 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13702:
---
Summary: Remove HTrace hooks from DFSClient to reduce CPU usage  (was: 
HTrace hooks taking 10-15% CPU in DFS client when disabled)

> Remove HTrace hooks from DFSClient to reduce CPU usage
> --
>
> Key: HDFS-13702
> URL: https://issues.apache.org/jira/browse/HDFS-13702
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: performance
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hdfs-13702.patch, hdfs-13702.patch, hdfs-13702.patch
>
>
> I am seeing DFSClient.newReaderTraceScope take ~15% CPU in a TeraValidate 
> workload even when HTrace is disabled. This is because it stringifies several 
> integers. We should avoid all allocation and stringification when HTrace is 
> disabled.






[jira] [Updated] (HDFS-13703) Avoid allocation of CorruptedBlocks hashmap when no corrupted blocks are hit

2018-07-02 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13703:
---
   Resolution: Fixed
Fix Version/s: 3.0.4
   3.1.1
   3.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-3.1, branch-3.0, thanks for working on this Todd!

> Avoid allocation of CorruptedBlocks hashmap when no corrupted blocks are hit
> 
>
> Key: HDFS-13703
> URL: https://issues.apache.org/jira/browse/HDFS-13703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: 3.2.0, 3.1.1, 3.0.4
>
> Attachments: hdfs-13703.patch, hdfs-13703.patch
>
>
> The DFSClient creates a CorruptedBlocks object, which contains a HashMap, on 
> every read call. In most cases, a read will not hit any corrupted blocks, and 
> this hashmap is not used. It seems the JIT isn't smart enough to eliminate 
> this allocation. We would be better off avoiding it and only allocating in 
> the rare case when a corrupt block is hit.
> Removing this allocation reduced CPU usage of a TeraValidate job by about 10%.
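
A sketch of the lazy-allocation idea, with simplified names and types:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Only create the map when a corrupt block is actually reported, since most
// reads never hit one.
public class CorruptedBlocksSketch {
  private Map<Long, String> corruptionMap;  // null until first corruption

  void addCorruptedBlock(long blockId, String datanode) {
    if (corruptionMap == null) {
      corruptionMap = new HashMap<>();  // rare path: allocate on demand
    }
    corruptionMap.put(blockId, datanode);
  }

  // Callers must tolerate null, which is the common, allocation-free case.
  Map<Long, String> getCorruptionMap() {
    return corruptionMap;
  }
}
{code}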






[jira] [Commented] (HDFS-13703) Avoid allocation of CorruptedBlocks hashmap when no corrupted blocks are hit

2018-07-02 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529643#comment-16529643
 ] 

Andrew Wang commented on HDFS-13703:


+1 LGTM, will commit shortly.

> Avoid allocation of CorruptedBlocks hashmap when no corrupted blocks are hit
> 
>
> Key: HDFS-13703
> URL: https://issues.apache.org/jira/browse/HDFS-13703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hdfs-13703.patch, hdfs-13703.patch
>
>
> The DFSClient creates a CorruptedBlocks object, which contains a HashMap, on 
> every read call. In most cases, a read will not hit any corrupted blocks, and 
> this hashmap is not used. It seems the JIT isn't smart enough to eliminate 
> this allocation. We would be better off avoiding it and only allocating in 
> the rare case when a corrupt block is hit.
> Removing this allocation reduced CPU usage of a TeraValidate job by about 10%.






[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2018-06-15 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514115#comment-16514115
 ] 

Andrew Wang commented on HDFS-13671:


I'm fine with reverting if we're seeing production issues. I wasn't that 
involved with HDFS-9260 except to try and answer Daryn's questions about 
real-world performance.

Given that there seems to be a lot more interest in maintaining the older 
version, I'm also inclined to revert for maintenance purposes.

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Priority: Major
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove blocks chunk by chunk in a loop.
> Actually the first step should be the more expensive operation and should 
> take more time. However, we now always see the NN hang during the 
> remove-block operation.
> Looking into this: we introduced a new structure, {{FoldedTreeSet}}, to get 
> better performance when dealing with FBRs/IBRs. But compared with the earlier 
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower, 
> since it takes additional time to rebalance tree nodes. When there are many 
> blocks to be removed/deleted, this looks bad.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide 
> {{getBlockIterator}} to return a block iterator, with no other get operation 
> for a specified block. Do we still need to use {{FoldedTreeSet}} in 
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get rather 
> than Update operations. Maybe we can revert this to the earlier 
> implementation.






[jira] [Commented] (HDFS-13616) Batch listing of multiple directories

2018-05-31 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497016#comment-16497016
 ] 

Andrew Wang commented on HDFS-13616:


Would we be okay with making this a DistributedFileSystem-only change, with no 
changes to FileSystem? Right now non-DFS will just throw 
UnsupportedOperationException anyway, so it's specific to HDFS as it is.

The batch feed API seems reasonable but it does add a lot of scope. I'm not 
sure how to apply it to stream-oriented operations like reading and writing 
files, so it may only be usable for metadata operations. So if we're okay with 
a limited set of operations initially (say, just listing and delete), then I 
could look into it.

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: BenchmarkListFiles.java, HDFS-13616.001.patch, 
> HDFS-13616.002.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Commented] (HDFS-13616) Batch listing of multiple directories

2018-05-25 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491400#comment-16491400
 ] 

Andrew Wang commented on HDFS-13616:


bq. if I pass a file (instead of a directory), will I get back a standalone 
PartialListing that only includes the FileStatus for that file?

Good question! I have a unit test that covers this too: it returns a 
PartialListing with just the FileStatus of the file, and getParent will return 
the file's path. This is the same behavior as listLocatedStatus.

This makes me realize, though, that "getParent" is not the best name, since it 
won't always be the parent. Maybe getSourcePath? getListedPath? Happy to take 
suggestions here, and yeah, I can beef up the documentation around this too.

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: BenchmarkListFiles.java, HDFS-13616.001.patch, 
> HDFS-13616.002.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Commented] (HDFS-13616) Batch listing of multiple directories

2018-05-25 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491002#comment-16491002
 ] 

Andrew Wang commented on HDFS-13616:


Thanks for taking a look, Xiao and Aaron!

bq. We currently FNFE on the first error. Is it possible a partition is deleted 
while another thread is listing halfway for Hive/Impala? What's the expected 
behavior from them if so? (I'm lacking the knowledge of this so no strong 
preference either way, but curious...)

This case is somewhat addressed by the unit test listSomeDoNotExist: you'll 
see that the get() method throws if there was an exception, but you can still 
get results from the other listing batches returned by the iterator.

If you're talking about listing a single large directory and the directory gets 
deleted during the listing, then yea this API will throw an FNFE like the 
existing RemoteIterator API. Paged listings aren't atomic.

bq. If caller added some subdirs to srcs, should we list the subdir twice, or 
throw, or 'smartly' list everything at most once?

This is addressed by the unit test listSamePaths: it lists them multiple 
times. I didn't see it as the role of the filesystem to coalesce these paths; 
semantically, I wanted it to behave like the existing RemoteIterator API 
called in a for loop.

Aaron, I'll hit your review comments in a new patch rev. Precommit is getting 
pretty close, so I'm hoping to coalesce review comments from others before 
posting the next one.

bq. Why not just RemoteIterator?

We need an entry point that can throw an exception for a single path without 
killing the entire listing. From a client POV, it's also nice to have the same 
path that was passed in provided back, since HDFS returns absolute, qualified 
paths. It also makes it easier to understand the empty directory case.
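
To make the intended client-side shape concrete, here is a hypothetical usage 
sketch. The batchedListStatusIterator method and the PartialListing type 
follow the names in this discussion; none of this is a committed API:

{code:java}
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// PartialListing is the proposed result type from this JIRA (hypothetical).
public class BatchedListingExample {
  void listPartitions(DistributedFileSystem dfs) throws IOException {
    List<Path> srcs = Arrays.asList(
        new Path("/warehouse/t/p=1"), new Path("/warehouse/t/p=2"));
    RemoteIterator<PartialListing<FileStatus>> it =
        dfs.batchedListStatusIterator(srcs);  // hypothetical method
    while (it.hasNext()) {
      PartialListing<FileStatus> batch = it.next();
      try {
        for (FileStatus st : batch.get()) {   // throws if this source failed
          System.out.println(st.getPath());
        }
      } catch (IOException e) {
        // A bad source path (e.g. FNFE) fails just this batch, not the call.
        System.err.println("Could not list " + batch.getParent() + ": " + e);
      }
    }
  }
}
{code}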

I attached the benchmark I ran for further examination. I think you correctly 
answered the usecase question yourself, but to confirm: the Hive/Impala client 
already has a list of leaf directories to list, so it'd require some 
contortions to use a recursive API like listFiles instead. I imagine a 
server-side listFiles (like what S3 has) would be a nice speedup though.

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: BenchmarkListFiles.java, HDFS-13616.001.patch, 
> HDFS-13616.002.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Updated] (HDFS-13616) Batch listing of multiple directories

2018-05-25 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13616:
---
Attachment: BenchmarkListFiles.java

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: BenchmarkListFiles.java, HDFS-13616.001.patch, 
> HDFS-13616.002.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Commented] (HDFS-13616) Batch listing of multiple directories

2018-05-24 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490068#comment-16490068
 ] 

Andrew Wang commented on HDFS-13616:


Latest patch addresses some precommit issues. As stated earlier, non-HDFS 
filesystems are going to throw UnsupportedOperationException. One correction 
to my earlier comment: the default listing limit is 1000, not 100. 100 is the 
current default limit on the number of paths that can be listed per batched 
listing call.

Hi Nicholas, thanks for taking a look. Currently we don't see a need for API 
support beyond listing. The workload we're looking at is metadata loading for 
applications like Hive and Impala.

Regarding an async API, Todd's benchmarking shows that the batched API is more 
CPU efficient than processing individual listing calls. It beats the 5-thread 
case for sparse directories in CPU time and wall time. My benchmarking 
additionally shows that the batched API generates significantly less garbage.

This batched listing API could also be combined with an async API (or a thread 
pool), so it's not an "either or" situation.

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13616.001.patch, HDFS-13616.002.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Updated] (HDFS-13616) Batch listing of multiple directories

2018-05-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13616:
---
Attachment: HDFS-13616.002.patch

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13616.001.patch, HDFS-13616.002.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Commented] (HDFS-13616) Batch listing of multiple directories

2018-05-24 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490032#comment-16490032
 ] 

Andrew Wang commented on HDFS-13616:


Hi Zhe, thanks for taking a look! This API respects the existing lsLimit 
setting of 100, and also limits the number of paths that can be listed in a 
single batch call. This means that the per-call overhead is very similar to the 
existing RemoteIterator calls when returning 100-item partial 
listings. Todd saw ~7ms RPC handling times for 100-item batches on a cluster, 
which feels like the right granularity for holding a read lock.

To answer Todd's question about benchmarking, I wrote a little unit test that 
invokes NameNodeRpcServer directly and times with System.nanoTime(). I made a 
synthetic directory structure with 30,000 directories, each with one file. 
This makes it a best-case scenario for the batched listing API. Precautions 
were taken to allow JVM warmup; I let the benchmarks run for about 30s before 
recording with JFR/JMC.

I was able to list 8.4x more LocatedFileStatuses/second with the batched 
listing. JMC showed a TLAB allocation rate of 5x. Non-TLAB allocation was 
trivial. This means we're much more CPU efficient per-FileStatus, and also 
doing less allocation.

Since this did not include RTT time or lock contention from concurrent threads, 
a more realistic benchmark might do even better. I think this explains the 
10-20x that Todd saw when benchmarking on a real cluster.

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13616.001.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Commented] (HDFS-13616) Batch listing of multiple directories

2018-05-24 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489568#comment-16489568
 ] 

Andrew Wang commented on HDFS-13616:


Thanks for providing the additional context Todd, agree on all points.

I'm happy to build a simple benchmark, let me start on that. I'm thinking of 
invoking the NN methods directly, since external measurement seems a little 
tricky, and I don't know how easy it is to hook in something like JMH. Getting 
a realistic measurement would require doing something multi-threaded on a 
server-class machine, but let's see how far I get on my laptop.

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13616.001.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Updated] (HDFS-13616) Batch listing of multiple directories

2018-05-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13616:
---
Status: Patch Available  (was: Open)

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13616.001.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Commented] (HDFS-13616) Batch listing of multiple directories

2018-05-24 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489518#comment-16489518
 ] 

Andrew Wang commented on HDFS-13616:


Patch attached to get some initial feedback on the API and a precommit run on 
the patch.

I can add a naive default FileSystem implementation if desired, I don't have 
any other immediate TODOs in mind.

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13616.001.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Updated] (HDFS-13616) Batch listing of multiple directories

2018-05-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13616:
---
Attachment: HDFS-13616.001.patch

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13616.001.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on RTT time when 
> partition directories contain a small number of files. This is fairly common, 
> since fine-grained partitioning is used for partition pruning by the query 
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.






[jira] [Created] (HDFS-13616) Batch listing of multiple directories

2018-05-24 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-13616:
--

 Summary: Batch listing of multiple directories
 Key: HDFS-13616
 URL: https://issues.apache.org/jira/browse/HDFS-13616
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 3.2.0
Reporter: Andrew Wang
Assignee: Andrew Wang


One of the dominant workloads for external metadata services is listing of 
partition directories. This can end up being bottlenecked on RTT when 
partition directories contain a small number of files. This is fairly common, 
since fine-grained partitioning is used for partition pruning by the query 
engines.

A batched listing API that takes multiple paths amortizes the RTT cost. Initial 
benchmarks show a 10-20x improvement in metadata loading performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13611) Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient

2018-05-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13611:
---
Resolution: Fixed
Fix Version/s: 3.0.4
   3.1.1
   3.2.0
   2.10.0
   Status: Resolved  (was: Patch Available)

I've committed this to trunk, branch-3.1, branch-3.0, and branch-2. Thanks to 
Todd and Xiao for reviewing, and to Todd for the original spot.

> Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient
> ---
>
> Key: HDFS-13611
> URL: https://issues.apache.org/jira/browse/HDFS-13611
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.2.0, 3.1.1, 3.0.4
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1, 3.0.4
>
> Attachments: HDFS-13611.001.patch, HDFS-13611.002.patch
>
>
> Follow on to HDFS-13601, a bug spotted by [~tlipcon]: since Text is mutable, 
> it's not safe to use as a hash map key.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13611) Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient

2018-05-23 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487864#comment-16487864
 ] 

Andrew Wang commented on HDFS-13611:


Even simpler patch attached: if we make a copy of the Text for the CHM key and 
never mutate it, we can avoid calling toString on the hot path.
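
To illustrate the hazard and the fix, a hedged sketch (not the committed 
patch): since Text is mutable, a caller that mutates an instance after it has 
been used as a map key changes the key's hashCode and makes the entry 
unreachable. Copying the key at insertion time sidesteps this:

{code}
import java.util.concurrent.ConcurrentHashMap;

import com.google.protobuf.ByteString;
import org.apache.hadoop.io.Text;

// Sketch of a Text -> ByteString cache that defensively copies its keys.
public final class TextByteStringCache {
  private final ConcurrentHashMap<Text, ByteString> cache =
      new ConcurrentHashMap<>();

  public ByteString toByteString(Text text) {
    ByteString cached = cache.get(text);
    if (cached == null) {
      cached = ByteString.copyFrom(text.getBytes(), 0, text.getLength());
      // Copy the key: the caller may reuse and mutate its Text instance,
      // which would silently corrupt the map if we stored the reference.
      cache.putIfAbsent(new Text(text), cached);
    }
    return cached;
  }
}
{code}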

> Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient
> ---
>
> Key: HDFS-13611
> URL: https://issues.apache.org/jira/browse/HDFS-13611
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.2.0, 3.1.1, 3.0.4
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13611.001.patch, HDFS-13611.002.patch
>
>
> Follow on to HDFS-13601, a bug spotted by [~tlipcon]: since Text is mutable, 
> it's not safe to use as a hash map key.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13611) Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient

2018-05-23 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13611:
---
Attachment: HDFS-13611.002.patch

> Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient
> ---
>
> Key: HDFS-13611
> URL: https://issues.apache.org/jira/browse/HDFS-13611
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.2.0, 3.1.1, 3.0.4
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13611.001.patch, HDFS-13611.002.patch
>
>
> Follow on to HDFS-13601, a bug spotted by [~tlipcon]: since Text is mutable, 
> it's not safe to use as a hash map key.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13611) Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient

2018-05-23 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487849#comment-16487849
 ] 

Andrew Wang commented on HDFS-13611:


Simple patch attached; getFixedByteString(Text) wasn't even useful anyway, 
since it was just calling toString internally.

> Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient
> ---
>
> Key: HDFS-13611
> URL: https://issues.apache.org/jira/browse/HDFS-13611
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.2.0, 3.1.1, 3.0.4
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13611.001.patch
>
>
> Follow on to HDFS-13601, a bug spotted by [~tlipcon]: since Text is mutable, 
> it's not safe to use as a hash map key.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13611) Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient

2018-05-23 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13611:
---
Affects Version/s: 3.0.4
   3.1.1
   3.2.0
   2.10.0
   Status: Patch Available  (was: Open)

> Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient
> ---
>
> Key: HDFS-13611
> URL: https://issues.apache.org/jira/browse/HDFS-13611
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.2.0, 3.1.1, 3.0.4
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13611.001.patch
>
>
> Follow on to HDFS-13601, a bug spotted by [~tlipcon]: since Text is mutable, 
> it's not safe to use as a hash map key.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13611) Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient

2018-05-23 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13611:
---
Attachment: HDFS-13611.001.patch

> Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient
> ---
>
> Key: HDFS-13611
> URL: https://issues.apache.org/jira/browse/HDFS-13611
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13611.001.patch
>
>
> Follow on to HDFS-13601, a bug spotted by [~tlipcon]: since Text is mutable, 
> it's not safe to use as a hash map key.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13611) Unsafe use of Text as a ConcurrentHashMap key in PBHelperClient

2018-05-23 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-13611:
--

 Summary: Unsafe use of Text as a ConcurrentHashMap key in 
PBHelperClient
 Key: HDFS-13611
 URL: https://issues.apache.org/jira/browse/HDFS-13611
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Andrew Wang
Assignee: Andrew Wang


Follow on to HDFS-13601, a bug spotted by [~tlipcon]: since Text is mutable, 
it's not safe to use as a hash map key.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-23 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13601:
---
Resolution: Fixed
Fix Version/s: 3.0.3
   3.1.1
   3.2.0
   2.10.0
   Status: Resolved  (was: Patch Available)

Thanks Xiao! I committed this to trunk, branch-3.1, branch-3.0, and branch-2. I 
think it's safe for the older branch-2 release lines as well if we want to do 
that. LMK.

> Optimize ByteString conversions in PBHelper
> ---
>
> Key: HDFS-13601
> URL: https://issues.apache.org/jira/browse/HDFS-13601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1, 3.0.3
>
> Attachments: HDFS-13601.001.patch, HDFS-13601.002.patch, 
> HDFS-13601.003.patch, HDFS-13601.004.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being 
> spent on String->ByteString conversions. These are often the same strings 
> being converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-23 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486777#comment-16486777
 ] 

Andrew Wang commented on HDFS-13601:


Thanks for reviewing, Xiao! I think the current behavior is actually no 
different from the existing PBHelperClient, since it calls 
{{setPoolId(b.getBlockPoolId())}} without a null check, and setPoolId will 
throw an NPE if the value is null. I can add the null check if you want, but 
since the behavior is the same as the existing code I think it's okay.

I also ran the failed tests locally and they passed, so I think they're flakes.
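
For context on the null-check point above, a hedged sketch of the conversion 
path being discussed (protobuf-generated builder setters reject null 
arguments):

{code}
import org.apache.hadoop.hdfs.protocol.ExtendedBlock;
import org.apache.hadoop.hdfs.protocol.proto.HdfsProtos.ExtendedBlockProto;

// Sketch: protobuf builder setters throw NullPointerException on null input,
// so a null block pool ID already fails here, with or without an explicit
// check in the caller.
public final class ConvertSketch {
  static ExtendedBlockProto convert(ExtendedBlock b) {
    return ExtendedBlockProto.newBuilder()
        .setPoolId(b.getBlockPoolId())        // NPE if the pool ID is null
        .setBlockId(b.getBlockId())
        .setGenerationStamp(b.getGenerationStamp())
        .build();
  }
}
{code}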

> Optimize ByteString conversions in PBHelper
> ---
>
> Key: HDFS-13601
> URL: https://issues.apache.org/jira/browse/HDFS-13601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13601.001.patch, HDFS-13601.002.patch, 
> HDFS-13601.003.patch, HDFS-13601.004.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being 
> spent on String->ByteString conversions. These are often the same strings 
> being converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-22 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486602#comment-16486602
 ] 

Andrew Wang commented on HDFS-13601:


I think this latest rev will pass precommit, and also incorporates Xiao's 
review comments.

> Optimize ByteString conversions in PBHelper
> ---
>
> Key: HDFS-13601
> URL: https://issues.apache.org/jira/browse/HDFS-13601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13601.001.patch, HDFS-13601.002.patch, 
> HDFS-13601.003.patch, HDFS-13601.004.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being 
> spent on String->ByteString conversions. These are often the same strings 
> being converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-22 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13601:
---
Attachment: HDFS-13601.004.patch

> Optimize ByteString conversions in PBHelper
> ---
>
> Key: HDFS-13601
> URL: https://issues.apache.org/jira/browse/HDFS-13601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13601.001.patch, HDFS-13601.002.patch, 
> HDFS-13601.003.patch, HDFS-13601.004.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being 
> spent on String->ByteString conversions. These are often the same strings 
> being converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-22 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13601:
---
Attachment: HDFS-13601.003.patch

> Optimize ByteString conversions in PBHelper
> ---
>
> Key: HDFS-13601
> URL: https://issues.apache.org/jira/browse/HDFS-13601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13601.001.patch, HDFS-13601.002.patch, 
> HDFS-13601.003.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being 
> spent on String->ByteString conversions. These are often the same strings 
> being converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-21 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13601:
---
Attachment: HDFS-13601.002.patch

> Optimize ByteString conversions in PBHelper
> ---
>
> Key: HDFS-13601
> URL: https://issues.apache.org/jira/browse/HDFS-13601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13601.001.patch, HDFS-13601.002.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being 
> spent on String->ByteString conversions. These are often the same strings 
> being converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-21 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483210#comment-16483210
 ] 

Andrew Wang commented on HDFS-13601:


I've attached a patch to give a flavor of the approach and to get a precommit 
run. The basic idea is to cache what are likely to be fixed strings, or strings 
coming from a limited set.

I tested this on CDH 5, but the same findings should also apply to trunk. For a 
pure listing-with-locations workload, JMC shows a reduction of TLAB allocation 
from 499MB/s to 384MB/s after applying this patch. Previously, 13% of stacks 
showed up in StringEncoder.encode (converting from String to byte array for 
PB), and now that's reduced to 5.6%. The hotspot is now creating the 
LocatedBlocks and adding all the StorageIDs, which is something to tackle 
separately.
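
As a rough illustration of the caching idea, a hedged sketch (not the patch 
itself): the encoding of strings drawn from a small fixed set can be memoized 
so each distinct string is converted only once.

{code}
import java.util.concurrent.ConcurrentHashMap;

import com.google.protobuf.ByteString;

// Sketch: memoize String -> ByteString conversions for values that come from
// a limited set (e.g. block pool IDs, storage IDs), so each distinct string
// is UTF-8 encoded once instead of on every RPC response.
public final class FixedByteStringCache {
  private static final ConcurrentHashMap<String, ByteString> CACHE =
      new ConcurrentHashMap<>();

  public static ByteString getFixedByteString(String key) {
    return CACHE.computeIfAbsent(key, ByteString::copyFromUtf8);
  }
}
{code}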

> Optimize ByteString conversions in PBHelper
> ---
>
> Key: HDFS-13601
> URL: https://issues.apache.org/jira/browse/HDFS-13601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13601.001.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being 
> spent on String->ByteString conversions. These are often the same strings 
> being converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-21 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13601:
---
Status: Patch Available  (was: Open)

> Optimize ByteString conversions in PBHelper
> ---
>
> Key: HDFS-13601
> URL: https://issues.apache.org/jira/browse/HDFS-13601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.1, 3.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13601.001.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being 
> spent on String->ByteString conversions. These are often the same strings 
> being converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-21 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-13601:
---
Attachment: HDFS-13601.001.patch

> Optimize ByteString conversions in PBHelper
> ---
>
> Key: HDFS-13601
> URL: https://issues.apache.org/jira/browse/HDFS-13601
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Major
> Attachments: HDFS-13601.001.patch
>
>
> While doing some profiling of the NN with JMC, I saw a lot of time being 
> spent on String->ByteString conversions. These are often the same strings 
> being converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13601) Optimize ByteString conversions in PBHelper

2018-05-21 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-13601:
--

 Summary: Optimize ByteString conversions in PBHelper
 Key: HDFS-13601
 URL: https://issues.apache.org/jira/browse/HDFS-13601
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.9.1, 3.1.0
Reporter: Andrew Wang
Assignee: Andrew Wang


While doing some profiling of the NN with JMC, I saw a lot of time being spent 
on String->ByteString conversions. These are often the same strings being 
converted over and over again, meaning there's room for optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13351) Revert HDFS-11156 from branch-2/branch-2.8

2018-03-27 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416028#comment-16416028
 ] 

Andrew Wang commented on HDFS-13351:


+1 pending

> Revert HDFS-11156 from branch-2/branch-2.8
> --
>
> Key: HDFS-13351
> URL: https://issues.apache.org/jira/browse/HDFS-13351
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: webhdfs
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: HDFS-13351-branch-2.001.patch
>
>
> Per discussion in HDFS-11156, lets revert the change from branch-2 and 
> branch-2.8. New patch can be tracked in HDFS-12459 .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11156) Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API

2018-03-26 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414187#comment-16414187
 ] 

Andrew Wang commented on HDFS-11156:


Sure, let's revert. I probably just missed it while pushing all the reverts. 
Thanks for being on top of this!

> Add new op GETFILEBLOCKLOCATIONS to WebHDFS REST API
> 
>
> Key: HDFS-11156
> URL: https://issues.apache.org/jira/browse/HDFS-11156
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.3
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Fix For: 3.0.0-alpha2
>
> Attachments: BlockLocationProperties_JSON_Schema.jpg, 
> BlockLocations_JSON_Schema.jpg, FileStatuses_JSON_Schema.jpg, 
> HDFS-11156-branch-2.01.patch, HDFS-11156.01.patch, HDFS-11156.02.patch, 
> HDFS-11156.03.patch, HDFS-11156.04.patch, HDFS-11156.05.patch, 
> HDFS-11156.06.patch, HDFS-11156.07.patch, HDFS-11156.08.patch, 
> HDFS-11156.09.patch, HDFS-11156.10.patch, HDFS-11156.11.patch, 
> HDFS-11156.12.patch, HDFS-11156.13.patch, HDFS-11156.14.patch, 
> HDFS-11156.15.patch, HDFS-11156.16.patch, Output_JSON_format_v10.jpg, 
> SampleResponse_JSON.jpg
>
>
> The following webhdfs REST API
> {code}
> http://:/webhdfs/v1/?op=GET_BLOCK_LOCATIONS=0=1
> {code}
> will get a response like
> {code}
> {
>   "LocatedBlocks" : {
> "fileLength" : 1073741824,
> "isLastBlockComplete" : true,
> "isUnderConstruction" : false,
> "lastLocatedBlock" : { ... },
> "locatedBlocks" : [ {...} ]
>   }
> }
> {code}
> This represents *o.a.h.h.p.LocatedBlocks*. However, according to the 
> *FileSystem* API, 
> {code}
> public BlockLocation[] getFileBlockLocations(Path p, long start, long len)
> {code}
> clients would expect an array of BlockLocation. This mismatch should be 
> fixed. Marked as an incompatible change, as this will change the output of the 
> GET_BLOCK_LOCATIONS API. (See the sketch below.)
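
For reference, a hedged sketch of the client-side expectation that the webhdfs 
response needs to map onto (hypothetical file path):

{code}
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// The FileSystem contract returns BlockLocation[], which is what a webhdfs
// GETFILEBLOCKLOCATIONS response would need to deserialize into.
public final class BlockLocationsSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    BlockLocation[] locs =
        fs.getFileBlockLocations(new Path("/some/file"), 0, 1);
    for (BlockLocation loc : locs) {
      System.out.println(Arrays.toString(loc.getHosts())
          + " @" + loc.getOffset() + " len=" + loc.getLength());
    }
  }
}
{code}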



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11600) Refactor TestDFSStripedOutputStreamWithFailure test classes

2018-02-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359055#comment-16359055
 ] 

Andrew Wang commented on HDFS-11600:


Sure, go for it Sammi :)

> Refactor TestDFSStripedOutputStreamWithFailure test classes
> ---
>
> Key: HDFS-11600
> URL: https://issues.apache.org/jira/browse/HDFS-11600
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 3.0.0-alpha2
>Reporter: Andrew Wang
>Priority: Minor
> Attachments: HDFS-11600-1.patch, HDFS-11600.002.patch
>
>
> TestDFSStripedOutputStreamWithFailure has a great number of subclasses. The 
> tests are parameterized based on the name of these subclasses.
> Seems like we could parameterize these tests with JUnit and then not need all 
> these separate test classes.
> Another note: the tests will randomly return instead of running the test body. 
> Using {{Assume}} instead would make it clearer in the test output that 
> these tests were skipped. (A sketch of both ideas follows below.)
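
A minimal sketch of both suggestions, assuming JUnit 4 style parameterization 
(hypothetical class and parameter names):

{code}
import static org.junit.Assume.assumeTrue;

import java.util.Arrays;
import java.util.Collection;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

// One parameterized class instead of many name-based subclasses.
@RunWith(Parameterized.class)
public class TestStripedOutputWithFailureSketch {
  @Parameters(name = "failurePosition={0}")
  public static Collection<Object[]> data() {
    return Arrays.asList(new Object[][] {{0L}, {65536L}, {1048576L}});
  }

  private final long failurePosition;

  public TestStripedOutputWithFailureSketch(long failurePosition) {
    this.failurePosition = failurePosition;
  }

  @Test
  public void testWriteWithDnFailure() {
    // Assume reports a skip in the test output instead of silently returning.
    assumeTrue("position must be within the test file length",
        failurePosition < (1L << 21));
    // ... exercise the striped output stream, failing a DN at failurePosition
  }
}
{code}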



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13009) Creation of Encryption zone should succeed even if directory is not empty.

2018-01-22 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335150#comment-16335150
 ] 

Andrew Wang commented on HDFS-13009:


Thanks for the reply Daryn. If this option is only intended for encryption of 
new data, then this makes sense.

The current limitation is only a policy thing, and I think it'd be fine to have 
a flag that makes EZs behave like storage policies or EC policies. There are 
potential touchpoints with the rename restrictions and reencrypt EDEK 
functionality, so let's make sure to cover those in the test suite.

I think it'd also be useful to have a way of checking if all the data in an EZ 
is encrypted and "finalizing" it. This makes it easy for users to understand 
when 100% of the data is encrypted, which I assume is the end goal even with 
the retention policy. For example, keep a flag on the EZ xattr while it's in 
mixed mode, and after iterating over the zone, remove the flag to indicate it's 
fully encrypted.

> Creation of Encryption zone should succeed even if directory is not empty.
> --
>
> Key: HDFS-13009
> URL: https://issues.apache.org/jira/browse/HDFS-13009
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: encryption
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Major
>
> Currently we have a restriction that creation of encryption zone can be done 
> only on an empty directory.
> This jira is to remove that restriction.
> Motivation:
> New customers who want to start using encryption zones can make an existing 
> directory encrypted.
> They will be able to read the old data as-is, while newly written data will be 
> encrypted (and transparently decrypted on read).
> Internally we have many customers asking for this feature.
> Currently they have to ask for more space quota and encrypt the old data.
> This will make their lives much easier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13009) Creation of Encryption zone should succeed even if directory is not empty.

2018-01-22 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334973#comment-16334973
 ] 

Andrew Wang commented on HDFS-13009:


Hi Rushabh,

The original design intent of the zone was to make the security properties 
easier to reason about, since the entire directory is encrypted, and with the 
same encryption key. Many of our security-conscious users want everything in 
HDFS encrypted, and the presence of any unencrypted data would be a compliance 
issue. So, I don't think we can change the default semantics of the zone, 
though possibly we could add a flag or new concept to support the usecase you 
describe.

IIUC, the motivation is to make the initial encryption process easier, with the 
goal of encrypting everything within the directory? In any case, the encryption 
of existing data still happens via copies which might blow quotas. I think this 
change helps with encrypting the newly written data, but not that much with the 
quota problem when converting existing data.

> Creation of Encryption zone should succeed even if directory is not empty.
> --
>
> Key: HDFS-13009
> URL: https://issues.apache.org/jira/browse/HDFS-13009
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: encryption
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Major
>
> Currently we have a restriction that creation of encryption zone can be done 
> only on an empty directory.
> This jira is to remove that restriction.
> Motivation:
> New customers who want to start using encryption zones can make an existing 
> directory encrypted.
> They will be able to read the old data as-is, while newly written data will be 
> encrypted (and transparently decrypted on read).
> Internally we have many customers asking for this feature.
> Currently they have to ask for more space quota and encrypt the old data.
> This will make their lives much easier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11942) make new chooseDataNode policy work in more operation like seek, fetch

2017-12-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11942:
---
Fix Version/s: (was: 3.0.0)
   3.0.1

> make new  chooseDataNode policy  work in more operation like seek, fetch
> 
>
> Key: HDFS-11942
> URL: https://issues.apache.org/jira/browse/HDFS-11942
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Affects Versions: 2.6.0, 2.7.0, 3.0.0-alpha3
>Reporter: Fangyuan Deng
> Fix For: 3.0.1
>
> Attachments: HDFS-11942.0.patch, HDFS-11942.1.patch, 
> ssd-first-disable(default).png, ssd-first-enable.png
>
>
> With the default policy, if a file is ONE_SSD, the client prefers reading the 
> local disk replica rather than the remote SSD replica.
> But now, PCIe SSDs and 10G Ethernet make reading a remote SSD faster than 
> reading the local disk.
> HDFS-9666 gave us a patch, but the code is incomplete and has not been updated 
> for a long time.
> This sub-task provides a complete patch, and 
> we have tested it on three machines [32-core CPU, 128G mem, 1000M network, 
> 1.2T HDD, 800G SSD (Intel P3600)].
> With this feature, the throughput of an HBase table (ONE_SSD) is double that 
> without it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11201) Spelling errors in the logging, help, assertions and exception messages

2017-12-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11201:
---
Fix Version/s: (was: 3.0.0)
   3.0.1

> Spelling errors in the logging, help, assertions and exception messages
> ---
>
> Key: HDFS-11201
> URL: https://issues.apache.org/jira/browse/HDFS-11201
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, diskbalancer, httpfs, namenode, nfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Grant Sohn
>Assignee: Grant Sohn
>Priority: Trivial
> Fix For: 3.0.1
>
> Attachments: HDFS-11201.1.patch, HDFS-11201.2.patch, 
> HDFS-11201.3.patch, HDFS-11201.4.patch
>
>
> Found a set of spelling errors in the user-facing code.
> Examples are:
> odlest -> oldest
> Illagal -> Illegal
> bounday -> boundary



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12920) HDFS default value change (with adding time unit) breaks old version MR tarball work with new version (3.0) of hadoop

2017-12-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12920:
---
Target Version/s: 3.0.1  (was: 3.0.0)

> HDFS default value change (with adding time unit) breaks old version MR 
> tarball work with new version (3.0) of hadoop
> -
>
> Key: HDFS-12920
> URL: https://issues.apache.org/jira/browse/HDFS-12920
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Junping Du
>Priority: Blocker
>
> After HADOOP-15059 got resolved, I tried to deploy the 2.9.0 tarball with 3.0.0 
> RC1, and ran the job with the following errors:
> {noformat}
> 2017-12-12 13:29:06,824 INFO [main] 
> org.apache.hadoop.service.AbstractService: Service 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.NumberFormatException: For input string: "30s"
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.NumberFormatException: For input string: "30s"
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:542)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:522)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1764)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:522)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:308)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1722)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1719)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1650)
> {noformat}
> This is because of HDFS-10845: we added time units to hdfs-default.xml, but 
> they cannot be recognized by old-version MR jars. 
> This breaks our rolling upgrade story, so it should be marked as a blocker.
> A quick workaround is to override the values in hdfs-site.xml with all time 
> units removed. But the right way may be to revert HDFS-10845 (and get rid of 
> the noisy warnings).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12920) HDFS default value change (with adding time unit) breaks old version MR tarball work with new version (3.0) of hadoop

2017-12-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288574#comment-16288574
 ] 

Andrew Wang commented on HDFS-12920:


Particularly since there's a workaround, let's bump this to 3.0.1.
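
To make the failure mode concrete, a hedged sketch (the key name is one of the 
affected defaults, assumed here for illustration): new clients parse the unit 
suffix with {{getTimeDuration}}, while old-version jars read the same key with 
{{getLong}}, which ends up in {{Long.parseLong("30s")}}:

{code}
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;

public final class TimeUnitSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    conf.set("dfs.client.datanode-restart.timeout", "30s"); // new-style default

    // New clients understand the suffix:
    long seconds = conf.getTimeDuration(
        "dfs.client.datanode-restart.timeout", 30, TimeUnit.SECONDS);
    System.out.println("parsed: " + seconds + "s");

    // Old clients call getLong(), i.e. Long.parseLong("30s"):
    conf.getLong("dfs.client.datanode-restart.timeout", 30);
    // -> java.lang.NumberFormatException: For input string: "30s"
  }
}
{code}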

> HDFS default value change (with adding time unit) breaks old version MR 
> tarball work with new version (3.0) of hadoop
> -
>
> Key: HDFS-12920
> URL: https://issues.apache.org/jira/browse/HDFS-12920
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Junping Du
>Priority: Blocker
>
> After HADOOP-15059 got resolved, I tried to deploy the 2.9.0 tarball with 3.0.0 
> RC1, and ran the job with the following errors:
> {noformat}
> 2017-12-12 13:29:06,824 INFO [main] 
> org.apache.hadoop.service.AbstractService: Service 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.NumberFormatException: For input string: "30s"
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.NumberFormatException: For input string: "30s"
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:542)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:522)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1764)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:522)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:308)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1722)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1719)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1650)
> {noformat}
> This is because of HDFS-10845: we added time units to hdfs-default.xml, but 
> they cannot be recognized by old-version MR jars. 
> This breaks our rolling upgrade story, so it should be marked as a blocker.
> A quick workaround is to override the values in hdfs-site.xml with all time 
> units removed. But the right way may be to revert HDFS-10845 (and get rid of 
> the noisy warnings).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers

2017-12-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287009#comment-16287009
 ] 

Andrew Wang commented on HDFS-12907:


The xattr change makes sense given the scope of this change.

We should also still validate that users who don't have read access to a file 
can't access its raw xattrs, if we aren't doing so already.

> Allow read-only access to reserved raw for non-superusers
> -
>
> Key: HDFS-12907
> URL: https://issues.apache.org/jira/browse/HDFS-12907
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Rushabh S Shah
> Attachments: HDFS-12907.001.patch, HDFS-12907.002.patch, 
> HDFS-12907.patch
>
>
> HDFS-6509 added a special /.reserved/raw path prefix to access the raw file 
> contents of EZ files.  In the simplest sense it doesn't return the FE info in 
> the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data.  
> This facilitates allowing tools like distcp to copy raw bytes.
> Access to the raw hierarchy is restricted to superusers.  This seems like an 
> overly broad restriction designed to prevent non-admins from munging the EZ 
> related xattrs.  I believe we should relax the restriction to allow 
> non-admins to perform read-only operations.  Allowing non-superusers to 
> easily read the raw bytes will be extremely useful for regular users, esp. 
> for enabling webhdfs client-side encryption.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12813) RequestHedgingProxyProvider can hide Exception thrown from the Namenode for proxy size of 1

2017-12-08 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12813:
---
Fix Version/s: (was: 3.0.0)
   3.1.0

> RequestHedgingProxyProvider can hide Exception thrown from the Namenode for 
> proxy size of 1
> ---
>
> Key: HDFS-12813
> URL: https://issues.apache.org/jira/browse/HDFS-12813
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: 3.1.0, 2.10.0
>
> Attachments: HDFS-12813.001.patch, HDFS-12813.002.patch, 
> HDFS-12813.003.patch, HDFS-12813.004.patch
>
>
> HDFS-11395 fixed the problem where the MultiException thrown by 
> RequestHedgingProxyProvider was hidden. However, when the target proxy size is 
> 1, unwrapping is not done for the InvocationTargetException. For a target 
> proxy size of 1, the unwrapping should be done to the first level, whereas for 
> multiple proxies, it should be done at 2 levels.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12814) Add blockId when warning slow mirror/disk in BlockReceiver

2017-12-08 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12814:
---
Fix Version/s: (was: 3.0.0)
   3.0.1
   3.1.0

> Add blockId when warning slow mirror/disk in BlockReceiver
> --
>
> Key: HDFS-12814
> URL: https://issues.apache.org/jira/browse/HDFS-12814
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Trivial
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-12814.001.patch, HDFS-12814.002.patch
>
>
> HDFS-11603 added downstream DataNodeIds and volume paths.
> To make debugging easier, those warning logs should also include the blockId.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers

2017-12-07 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282579#comment-16282579
 ] 

Andrew Wang commented on HDFS-12907:


Solving this for WebHdfsFileSystem is a lot more tractable than Hue, so this 
makes sense to me.

FYI [~xiaochen] also, since he's a KMS expert and was also involved in the 
internal discussions.

> Allow read-only access to reserved raw for non-superusers
> -
>
> Key: HDFS-12907
> URL: https://issues.apache.org/jira/browse/HDFS-12907
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Rushabh S Shah
> Attachments: HDFS-12907.patch
>
>
> HDFS-6509 added a special /.reserved/raw path prefix to access the raw file 
> contents of EZ files.  In the simplest sense it doesn't return the FE info in 
> the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data.  
> This facilitates allowing tools like distcp to copy raw bytes.
> Access to the raw hierarchy is restricted to superusers.  This seems like an 
> overly broad restriction designed to prevent non-admins from munging the EZ 
> related xattrs.  I believe we should relax the restriction to allow 
> non-admins to perform read-only operations.  Allowing non-superusers to 
> easily read the raw bytes will be extremely useful for regular users, esp. 
> for enabling webhdfs client-side encryption.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers

2017-12-07 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282487#comment-16282487
 ] 

Andrew Wang commented on HDFS-12907:


I think that would work, though of course I'd prefer not to open up internal 
state representation if it can be avoided.

On the topic of webhdfs client-side encryption, could you talk a little more 
about your usecase? We discussed this internally before in the context of Hue, 
and there didn't seem to be a great solution. They have a very simple Python 
WebHDFS client built around effectively curl, and they'd need to add their own 
KMS client and encryption routines. Really though, we'd want to move this all 
the way to the browser, and write the KMS client and encryption routines in 
Javascript. Ouch.

A way of scoping the KMS delegation token to limit what keys could be accessed 
would also be an improvement, e.g. a "key token" similar to the HDFS block 
token. It addresses some of the issues with webhdfs and encryption.

> Allow read-only access to reserved raw for non-superusers
> -
>
> Key: HDFS-12907
> URL: https://issues.apache.org/jira/browse/HDFS-12907
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Rushabh S Shah
> Attachments: HDFS-12907.patch
>
>
> HDFS-6509 added a special /.reserved/raw path prefix to access the raw file 
> contents of EZ files.  In the simplest sense it doesn't return the FE info in 
> the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data.  
> This facilitates allowing tools like distcp to copy raw bytes.
> Access to the raw hierarchy is restricted to superusers.  This seems like an 
> overly broad restriction designed to prevent non-admins from munging the EZ 
> related xattrs.  I believe we should relax the restriction to allow 
> non-admins to perform read-only operations.  Allowing non-superusers to 
> easily read the raw bytes will be extremely useful for regular users, esp. 
> for enabling webhdfs client-side encryption.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7240) Object store in HDFS

2017-12-07 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282441#comment-16282441
 ] 

Andrew Wang commented on HDFS-7240:
---

Hi Sanjay,

Thanks for writing up that summary. It's clear there's still disagreement on 
the merge. How should we proceed on reaching consensus? On the last call you 
suggested making a document, or we could do another call.

> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS Scalability and Ozone.pdf, HDFS-7240.001.patch, 
> HDFS-7240.002.patch, HDFS-7240.003.patch, HDFS-7240.003.patch, 
> HDFS-7240.004.patch, HDFS-7240.005.patch, HDFS-7240.006.patch, 
> MeetingMinutes.pdf, Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, 
> ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers

2017-12-07 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282250#comment-16282250
 ] 

Andrew Wang commented on HDFS-12907:


It's so they don't accidentally write data without xattrs or with the wrong 
xattrs, which would be essentially corrupt. We also don't want plaintext 
getting written accidentally.

The NameNode does a fair amount of work at create time to provision an EDEK for 
the file. This logic is beyond the ability of most clients.

> Allow read-only access to reserved raw for non-superusers
> -
>
> Key: HDFS-12907
> URL: https://issues.apache.org/jira/browse/HDFS-12907
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>Assignee: Rushabh S Shah
>
> HDFS-6509 added a special /.reserved/raw path prefix to access the raw file 
> contents of EZ files.  In the simplest sense it doesn't return the FE info in 
> the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data.  
> This facilitates allowing tools like distcp to copy raw bytes.
> Access to the raw hierarchy is restricted to superusers.  This seems like an 
> overly broad restriction designed to prevent non-admins from munging the EZ 
> related xattrs.  I believe we should relax the restriction to allow 
> non-admins to perform read-only operations.  Allowing non-superusers to 
> easily read the raw bytes will be extremely useful for regular users, esp. 
> for enabling webhdfs client-side encryption.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11576) Block recovery will fail indefinitely if recovery time > heartbeat interval

2017-12-07 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11576:
---
Fix Version/s: 3.0.0

> Block recovery will fail indefinitely if recovery time > heartbeat interval
> ---
>
> Key: HDFS-11576
> URL: https://issues.apache.org/jira/browse/HDFS-11576
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs, namenode
>Affects Versions: 2.7.1, 2.7.2, 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2
>Reporter: Lukas Majercak
>Assignee: Lukas Majercak
>Priority: Critical
> Fix For: 3.0.0, 2.9.1
>
> Attachments: HDFS-11576-branch-2.00.patch, 
> HDFS-11576-branch-2.01.patch, HDFS-11576.001.patch, HDFS-11576.002.patch, 
> HDFS-11576.003.patch, HDFS-11576.004.patch, HDFS-11576.005.patch, 
> HDFS-11576.006.patch, HDFS-11576.007.patch, HDFS-11576.008.patch, 
> HDFS-11576.009.patch, HDFS-11576.010.patch, HDFS-11576.011.patch, 
> HDFS-11576.012.patch, HDFS-11576.013.patch, HDFS-11576.014.patch, 
> HDFS-11576.015.patch, HDFS-11576.repro.patch
>
>
> Block recovery will fail indefinitely if the time to recover a block is 
> always longer than the heartbeat interval. Scenario:
> 1. DN sends heartbeat 
> 2. NN sends a recovery command to DN, recoveryID=X
> 3. DN starts recovery
> 4. DN sends another heartbeat
> 5. NN sends a recovery command to DN, recoveryID=X+1
> 6. DN calls commitBlockSynchronization to the NN after succeeding with the 
> first recovery, which fails because X < X+1
> ... 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12907) Allow read-only access to reserved raw for non-superusers

2017-12-06 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281129#comment-16281129
 ] 

Andrew Wang commented on HDFS-12907:


SGTM, we locked it down originally since we didn't know of a usecase besides 
distcp (which often runs as a superuser). The contents of the FEInfo are 
already accessible to anyone who has permissions to read the file if they write 
a custom DFSClient.
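
For illustration, a hedged sketch of what read-only raw access would enable 
(hypothetical paths; assume /ez is an encryption zone). Opening through the 
/.reserved/raw prefix returns the raw bytes, since no FEInfo is attached to the 
LocatedBlocks:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class RawReadSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Normal path: the client fetches the EDEK via the KMS and decrypts.
    // Raw path: no FEInfo is returned, so the bytes come back as ciphertext.
    try (FSDataInputStream in = fs.open(new Path("/.reserved/raw/ez/file"))) {
      byte[] buf = new byte[4096];
      int n = in.read(buf);
      System.out.println("read " + n + " raw (encrypted) bytes");
    }
  }
}
{code}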

> Allow read-only access to reserved raw for non-superusers
> -
>
> Key: HDFS-12907
> URL: https://issues.apache.org/jira/browse/HDFS-12907
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Daryn Sharp
>
> HDFS-6509 added a special /.reserved/raw path prefix to access the raw file 
> contents of EZ files.  In the simplest sense it doesn't return the FE info in 
> the {{LocatedBlocks}} so the dfs client doesn't try to decrypt the data.  
> This facilitates allowing tools like distcp to copy raw bytes.
> Access to the raw hierarchy is restricted to superusers.  This seems like an 
> overly broad restriction designed to prevent non-admins from munging the EZ 
> related xattrs.  I believe we should relax the restriction to allow 
> non-admins to perform read-only operations.  Allowing non-superusers to 
> easily read the raw bytes will be extremely useful for regular users, esp. 
> for enabling webhdfs client-side encryption.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2017-12-05 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16279550#comment-16279550
 ] 

Andrew Wang commented on HDFS-10285:


Thanks for chiming in Daryn,

bq. My preference is this feature, like all scan features, should be outside 
the NN. Integrated functionality is arguably more user-friendly but it comes 
with its own costs. Namely increased complexity and maintenance. It's yet 
another feature to accommodate in future core features.

Keeping it in the NameNode is easier from a deployment standpoint. It's 
arguable whether this benefit is more important than the benefits of making it 
separate.

I'm coming at this from the standpoint of supporting Cloudera's Hadoop 
customers. For a large, sophisticated Hadoop user like Yahoo, it may not be a 
big cost to deploy a new service, but in relative terms a much bigger cost for 
a small user. Being able to reach in and kill a rogue process or iteratively 
test new versions is great when you're a power user, but not for the average 
Hadoop admin who wants this to be turnkey. You'd be amazed at the cluster-write 
support tickets we've resolved by saying "run the balancer", just because it 
doesn't run automatically. I've fielded similar questions about HSM that were 
answered by "run the mover". It's the first thing users trip over.

Replying to the other concerns, we already have mechanisms for reconfiguring 
the NN so I don't see that as an inherent limitation. Running on a precisely 
scheduled basis also doesn't seem inherent, and also isn't what Anu was 
proposing since the SPS would still be triggered by a NN RPC, not by cron or 
something.

Finally, the SPS is off by default, and pretty safe since the new code sits 
separate from the rest of the NN paths. There's also already a separate mover 
command which runs like the balancer, for users who prefer that.

Are there still outstanding concerns with merging this? Uma proposed a call 
above, and I think that's the next step if we still need to reach consensus.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-SPS-TestReport-20170708.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These 
> policies can be set on a directory/file to specify the user's preference for 
> where to store the physical blocks. When the user sets the storage policy 
> before writing data, the blocks can take advantage of the storage policy 
> preferences and the physical blocks are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, then 
> the blocks would have been written with the default storage policy (nothing but 
> DISK). The user has to run the ‘Mover tool’ explicitly, specifying all such 
> file names as a list. In some distributed system scenarios (e.g. HBase) it 
> would be difficult to collect all the files and run the tool, as different 
> nodes can write files separately and the files can have different paths.
> Another scenario is when the user renames files from a directory with one 
> effective storage policy (inherited from the parent directory) into a directory 
> with another storage policy: the rename will not copy the inherited storage 
> policy from the source, so the destination file/dir's parent storage policy 
> takes effect. This rename operation is just a metadata change in the Namenode; 
> the physical blocks still remain with the source storage policy.
> So, tracking all such business-logic-based file names from distributed nodes 
> (e.g. region servers) and running the Mover tool could be difficult for admins. 
> Here the proposal is to provide an API in the Namenode itself to trigger 
> storage policy satisfaction. A daemon thread inside the Namenode should track 
> such calls and send movement commands to the DNs. 
> Will post a detailed design document soon. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12840) Creating a file with non-default EC policy in a EC zone is not correctly serialized in the editlog

2017-12-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12840:
---
Summary: Creating a file with non-default EC policy in a EC zone is not 
correctly serialized in the editlog  (was: Creating a file with non-default EC 
policy in a EC zone does not correctly serialized in EditLogs)

> Creating a file with non-default EC policy in a EC zone is not correctly 
> serialized in the editlog
> --
>
> Key: HDFS-12840
> URL: https://issues.apache.org/jira/browse/HDFS-12840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-12840.00.patch, HDFS-12840.01.patch, 
> HDFS-12840.02.patch, HDFS-12840.03.patch, HDFS-12840.04.patch, 
> HDFS-12840.reprod.patch, editsStored, editsStored, editsStored.03
>
>
> When creating a replicated file in an existing EC zone, the edit log does not 
> differentiate it from an EC file. When {{FSEditLogLoader}} replays the edits, 
> the file is treated as an EC file; as a result, the NN crashes because the 
> blocks of this file are replicated, which does not match the {{INode}}.
> {noformat}
> ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered 
> exception on operation AddBlockOp [path=/system/balancer.id, 
> penultimateBlock=NULL, lastBlock=blk_1073743259_2455, RpcClientId=, 
> RpcCallId=-2]
> java.lang.IllegalArgumentException: reportedBlock is not striped
>   at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.addStorage(BlockInfoStriped.java:118)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.addBlock(DatanodeStorageInfo.java:256)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:3141)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlockUnderConstruction(BlockManager.java:3068)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:3864)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processQueuedMessages(BlockManager.java:2916)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processQueuedMessagesForBlock(BlockManager.java:2903)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.addNewBlock(FSEditLogLoader.java:1069)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:532)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
> {noformat}
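> A minimal repro sketch (the attached HDFS-12840.reprod.patch is the 
> authoritative repro; the EC and builder calls below are assumptions about the 
> 3.x client API):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataOutputStream;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.hdfs.DistributedFileSystem;
> 
> // Create an EC zone, then force a *replicated* file inside it. Replaying
> // the resulting edits (e.g. on a standby NN) hits the crash above.
> Configuration conf = new Configuration();
> DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
> Path zone = new Path("/eczone");
> dfs.mkdirs(zone);
> dfs.setErasureCodingPolicy(zone, "RS-6-3-1024k");
> try (FSDataOutputStream out =
>     dfs.createFile(new Path(zone, "replicated")).replicate().build()) {
>   out.write(1);
> }
> {code}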



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2017-12-01 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16275034#comment-16275034
 ] 

Andrew Wang commented on HDFS-10285:


Hi Anu, thanks for the prompt responses,

bq. Yes, [ZK] would be the simplest approach to getting SPS HA.

Could you describe this plan in more detail? ZK doesn't solve the problems of 
HA by itself. We still need to think about idempotency. Does it require ZKFCs? 
I want to emphasize again the operational complexity that comes from adding 
more daemons and processes. It's a big knock on the ease of use of HDFS right 
now.

All of this adds significant complexity to deploying this feature. Adding 
another ZK dependency to HDFS is also undesirable from my POV. ZK is used 
instead of QJM for NN leader election for legacy reasons. It'd be better to 
drop the ZK dependency from HDFS entirely.

bq. Once the active knows it is the leader, it can read the state from NN and 
continue. The issues of continuity are exactly same whether it is inside NN or 
outside.

Does this involve rescanning a significant portion of the namespace? 
Synchronizing state over an RPC boundary (which can fail) is also more 
complicated than going in-memory. We've also already got mechanisms in place 
for safely synchronizing namespace and block state between NNs.

bq. As soon as a block is moved, the move call updates the status of the block 
move, that is NN is up to date with that info. Each time there is a call to SPS 
API, NN will keep track of it and the updates after move lets us filter the 
remaining blocks.

Is there an edit log update on every block move? That would be a lot of overhead, 
particularly since we don't persist block locations in HDFS right now.

bq. By that argument, Balancer should be the first tool that move into the 
Namenode and then DiskBalancer. Right now, SPS approach follows what we are 
doing in HDFS world, that is block moves are achieved thru an async mechanism. 
If you would like to provide a generic block mover mechanism in Namenode and 
then port balancer and diskBalancer, you are most welcome. I will be glad to 
move SPS to that framework when we have it.

The existing code being bad isn't a good reason to make it worse. I remember 
that the original motivation for the SPS was to reduce the deployment and 
operational complexity of running the balancer and mover. Making it a separate 
process again means we lose those benefits.

bq. There are a couple of concerns: 

I don't agree with #1 for the reason stated above. The DiskBalancer is fine 
since it's local to one DN, but the Balancer and Mover circumventing global 
coordination is an anti-pattern IMO.

Regarding #2, in my previous comment, I provided a number of tasks that are 
performed by the SPS-in-NN. Could you point to which of these are offloaded 
from the NN by having the SPS as a separate service? Even a separate-service 
SPS still adds NN memory and CPU overhead. Also, as I said in my previous 
comment, marshalling and unmarshalling over an RPC interface is less efficient 
than scanning these NN data structures in-process.

#3, I don't follow how SSM or provided block storage benefit from SPS as a 
service vs. being part of the NN. If there are design docs for these 
interactions, I would appreciate some references.

bq. And most important, we are just accelerating an SPS future work item, it 
has been a booked plan to make SPS separate,

Where is this plan described and motivated? The design doc from last month 
talks about the SPS as a daemon thread in the NN.

It'd help to write up a more detailed design doc for review by the watchers on 
this JIRA. Making it a new service sounds like a big effort on top of what has 
already been done.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-SPS-TestReport-20170708.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These 
> policies can be set on a directory/file to specify the user's preference for 
> where the physical blocks should be stored. When a user sets the storage policy 
> before writing data, the blocks can take advantage of the storage policy 
> preferences and are stored accordingly. 
> If user 

[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2017-12-01 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274950#comment-16274950
 ] 

Andrew Wang commented on HDFS-10285:


Is it trivial? I think we still need some type of fencing so there's only one 
active SPS. Does this use zookeeper, like NN HA? If there's an SPS failover, 
how does the new active know where to resume? I'm also wondering how progress 
is tracked, so we can resume without iterating over significant portions of the 
namespace. This also relates to my request for a "-w" flag; really we'd like 
more granular progress tracking if possible.

This was not simple to do for EDEK re-encryption, and it gets more complicated 
when state needs to be passed across an RPC boundary and in a stateless service.

I also like centralized control when it comes to coordinating block work. The 
NN schedules and prioritizes block work on the cluster. Already it's annoying 
to users to have to configure a separate set of resource throttles for the 
balancer work, and it makes the system less reactive to cluster health events. 
We'd much rather have a single resource allocation for all cluster maintenance 
work, which the NN can use however it wants based on its priority.

What is the concern about NN overhead, for this feature in particular? This is 
similar to what I asked Uma earlier about the coordinator DN; I don't think it 
meaningfully shifts work off the NN. We iterate the block map already to do 
tasks like decommissioning, and iterating over RPC instead is more overhead and 
adds complexity. The NN is still processing IBRs in any case, and it sounds 
like it's also still responsible for edit log ops for persistence of ongoing 
requests and progress.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-SPS-TestReport-20170708.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These 
> policies can be set on a directory/file to specify the user's preference for 
> where the physical blocks should be stored. When a user sets the storage policy 
> before writing data, the blocks can take advantage of the storage policy 
> preferences and are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, the 
> blocks will have been written with the default storage policy (namely DISK). 
> The user then has to run the ‘Mover tool’ explicitly, specifying all such file 
> names as a list. In some distributed system scenarios (ex: HBase) it would be 
> difficult to collect all the files and run the tool, as different nodes can 
> write files separately and the files can have different paths.
> Another scenario is rename: when a user renames a file from a directory with 
> one effective storage policy (inherited from the parent directory) to a 
> directory with a different storage policy, the inherited storage policy is not 
> copied from the source; the file takes its effective policy from the 
> destination file/dir's parent. This rename operation is just a metadata change 
> in the Namenode; the physical blocks still remain with the source storage 
> policy.
> So, tracking all such business-logic-based file names from distributed nodes 
> (ex: region servers) and running the Mover tool could be difficult for admins. 
> Here the proposal is to provide an API in the Namenode itself to trigger 
> storage policy satisfaction. A daemon thread inside the Namenode would track 
> such calls and send movement commands to the DNs. 
> Will post the detailed design thoughts document soon. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2017-12-01 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274838#comment-16274838
 ] 

Andrew Wang commented on HDFS-10285:


How will clients find this new service? Is it highly available? How does it 
interact with rolling upgrade?

Adding a new service is not something to do lightly.

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-SPS-TestReport-20170708.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These 
> policies can be set on a directory/file to specify the user's preference for 
> where the physical blocks should be stored. When a user sets the storage policy 
> before writing data, the blocks can take advantage of the storage policy 
> preferences and are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, the 
> blocks will have been written with the default storage policy (namely DISK). 
> The user then has to run the ‘Mover tool’ explicitly, specifying all such file 
> names as a list. In some distributed system scenarios (ex: HBase) it would be 
> difficult to collect all the files and run the tool, as different nodes can 
> write files separately and the files can have different paths.
> Another scenario is rename: when a user renames a file from a directory with 
> one effective storage policy (inherited from the parent directory) to a 
> directory with a different storage policy, the inherited storage policy is not 
> copied from the source; the file takes its effective policy from the 
> destination file/dir's parent. This rename operation is just a metadata change 
> in the Namenode; the physical blocks still remain with the source storage 
> policy.
> So, tracking all such business-logic-based file names from distributed nodes 
> (ex: region servers) and running the Mover tool could be difficult for admins. 
> Here the proposal is to provide an API in the Namenode itself to trigger 
> storage policy satisfaction. A daemon thread inside the Namenode would track 
> such calls and send movement commands to the DNs. 
> Will post the detailed design thoughts document soon. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12872) EC Checksum broken when BlockAccessToken is enabled

2017-11-30 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273066#comment-16273066
 ] 

Andrew Wang commented on HDFS-12872:


If we can get it in before the other blockers are resolved, I'm fine with 
including this.

> EC Checksum broken when BlockAccessToken is enabled
> ---
>
> Key: HDFS-12872
> URL: https://issues.apache.org/jira/browse/HDFS-12872
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Critical
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-12872.repro.patch
>
>
> It appears {{hdfs ec -checksum}} doesn't work when block access token is 
> enabled.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12754) Lease renewal can hit a deadlock

2017-11-22 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263261#comment-16263261
 ] 

Andrew Wang commented on HDFS-12754:


Agree, please get it into branch-3.0.0 as well since we're rolling a new RC.

> Lease renewal can hit a deadlock 
> -
>
> Key: HDFS-12754
> URL: https://issues.apache.org/jira/browse/HDFS-12754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.1.0
>
> Attachments: HDFS-12754.001.patch, HDFS-12754.002.patch, 
> HDFS-12754.003.patch, HDFS-12754.004.patch, HDFS-12754.005.patch, 
> HDFS-12754.006.patch, HDFS-12754.007.patch, HDFS-12754.008.patch, 
> HDFS-12754.009.patch
>
>
> The Client and the renewer can hit a deadlock during the close operation, since 
> closeFile() reaches back into DFSClient#removeFileBeingWritten. This is 
> possible if the client calls close while the renewer is renewing a lease.
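> A simplified illustration of the lock inversion (a sketch, not the actual 
> DFSClient/LeaseRenewer code):
> {code:java}
> class DeadlockSketch {
>   final Object clientLock = new Object();
>   final Object renewerLock = new Object();
> 
>   void clientClose() {                 // client thread
>     synchronized (clientLock) {
>       synchronized (renewerLock) {     // blocks while the renewer is active
>         // remove the file from the renewal list
>       }
>     }
>   }
> 
>   void renewLease() {                  // renewer thread
>     synchronized (renewerLock) {
>       synchronized (clientLock) {      // waits on the client -> deadlock
>         // closeFile() -> DFSClient#removeFileBeingWritten
>       }
>     }
>   }
> }
> {code}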



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7240) Object store in HDFS

2017-11-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257730#comment-16257730
 ] 

Andrew Wang commented on HDFS-7240:
---

Some Hortonworkers and Clouderans met yesterday, here are my meeting notes. I 
wanted to get them up before the broader meeting today. I already sent these 
around to the attendees, but please comment if I got anything incorrect.

Attendees: ATM, Andrew, Anu, Aaron Fabbri, Jitendra, Sanjay, other listeners on 
the phone

High-level questions raised:

* Wouldn't Ozone be better off as a separate project?
* Why should it be merged now?

Things we agree on:

* We're all on Team Ozone, and applaud any effort to address scaling HDFS.
* There are benefits to Ozone being a separate project. Can release faster, 
iterate more quickly on feedback, and mature without having to worry about 
features like high-availability, security, encryption, etc. that not all 
customers need.
* No agreement on whether the benefits of separation outweigh the downsides.

Discussion:

* Anu: Don't want to have this separate since it confuses people about the 
long-term vision of Ozone. It's intended as block management for HDFS.
* Andrew: In its current state, Ozone cannot be plugged into the NN as the 
BM layer, so it seems premature to merge. Can't benefit existing users, and 
they can't test it.
* Response: The Ozone block layer is at a good integration point, and we 
want to move on to the NameNode changes like splitting the FSN/BM lock.
* Andrew: We can do the FSN/BM lock split without merging Ozone. Separate 
efforts. This lock split is also a major effort by itself, and is a dangerous 
change. It's something that should be baked in production.
* Sanjay: Ozone developers "willing to take the hit" of the slow Hadoop release 
cadence. Want to make this part of HDFS since it's easier for users to test and 
consume without installing a new cluster.
* ATM: Can still share the same hardware, and run the Ozone daemons 
alongside.
* Sanjay: Want to keep Ozone block management inside the Datanode process to 
enable a fast-copy between HDFS and Ozone. Not all data needs all the HDFS 
features like encryption, erasure coding, etc, and this data could be stored in 
Ozone.
* Andrew: This fast-copy hasn't been implemented or discussed yet. Unclear 
if it'll work at all with existing HDFS block management. Won't work with 
encryption or erasure coding. Not clear whether it requires being in the same 
DN process even.
* Sanjay/Anu: Ozone is also useful to test with just the key-value interface. 
It's a Hadoop-compatible FileSystem, so apps that work on S3 will work on Ozone 
too.
* Andrew: If it provides a new API and doesn't support the HDFS 
feature-set, doesn't this support it being its own project?

Summary

* No consensus on the high-level questions raised
* Ozone could be its own project and integrated later, or remain on an HDFS 
branch
* Without the FSN/BM lock split, it can't serve as the block management layer 
for HDFS
* Without fast copy, there's no need for it to be part of the DataNode 
process, and it might not need to be in the same process anyway.

> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS Scalability and Ozone.pdf, HDFS-7240.001.patch, 
> HDFS-7240.002.patch, HDFS-7240.003.patch, HDFS-7240.003.patch, 
> HDFS-7240.004.patch, HDFS-7240.005.patch, HDFS-7240.006.patch, 
> MeetingMinutes.pdf, Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, 
> ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12257) Expose getSnapshottableDirListing as a public API in HdfsAdmin

2017-11-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12257:
---
Target Version/s: 2.8.3, 3.1.0, 2.9.1  (was: 2.8.3, 3.0.0, 2.9.1)

> Expose getSnapshottableDirListing as a public API in HdfsAdmin
> --
>
> Key: HDFS-12257
> URL: https://issues.apache.org/jira/browse/HDFS-12257
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
> Attachments: HDFS-12257.001.patch, HDFS-12257.002.patch, 
> HDFS-12257.003.patch
>
>
> Found at HIVE-16294. We have a CLI API for listing snapshottable dirs, but no 
> programmatic API. Other snapshot APIs are exposed in HdfsAdmin, I think we 
> should expose listing there as well.
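> Hypothetical usage once exposed (the method name mirrors the existing 
> {{DistributedFileSystem#getSnapshottableDirListing}}; it is not committed yet):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.hdfs.client.HdfsAdmin;
> import org.apache.hadoop.hdfs.protocol.SnapshottableDirectoryStatus;
> 
> Configuration conf = new Configuration();
> HdfsAdmin admin = new HdfsAdmin(FileSystem.getDefaultUri(conf), conf);
> // Proposed API; today this listing is only reachable via the CLI or
> // DistributedFileSystem directly.
> SnapshottableDirectoryStatus[] dirs = admin.getSnapshottableDirListing();
> for (SnapshottableDirectoryStatus s : dirs) {
>   System.out.println(s.getFullPath());
> }
> {code}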



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12654) APPEND API call is different in HTTPFS and NameNode REST

2017-11-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12654:
---
Target Version/s: 3.1.0, 2.9.1, 2.8.4  (was: 3.0.0, 2.9.1, 2.8.4)

> APPEND API call is different in HTTPFS and NameNode REST
> 
>
> Key: HDFS-12654
> URL: https://issues.apache.org/jira/browse/HDFS-12654
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, httpfs, namenode
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 3.0.0-beta1
>Reporter: Andras Czesznak
>
> The APPEND REST API call behaves differently in the NameNode REST and the 
> HTTPFS code. The NameNode version creates the target file that the new data 
> is appended to if it does not exist at the time the call is issued. The 
> HTTPFS version assumes the target file exists when APPEND is called and can 
> only append the new data; it does not create the target file if it doesn't 
> exist.
> The two implementations should be standardized, preferably by modifying the 
> HTTPFS version to execute an implicit CREATE if the target file does not 
> exist.
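> A sketch of the proposed HTTPFS behavior, expressed against the generic 
> FileSystem API (illustrative; not the actual HttpFSServer code path, and the 
> target path is hypothetical):
> {code:java}
> import java.nio.charset.StandardCharsets;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataOutputStream;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> 
> Configuration conf = new Configuration();
> Path target = new Path("/user/test/data.txt"); // hypothetical path
> FileSystem fs = target.getFileSystem(conf);
> // Implicit CREATE when the target is missing, matching the NameNode REST.
> try (FSDataOutputStream out =
>     fs.exists(target) ? fs.append(target) : fs.create(target)) {
>   out.write("appended bytes".getBytes(StandardCharsets.UTF_8));
> }
> {code}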



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11885) createEncryptionZone should not block on initializing EDEK cache

2017-11-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11885:
---
Target Version/s: 2.8.3, 3.1.0, 2.9.1  (was: 2.8.3, 3.0.0, 2.9.1)

> createEncryptionZone should not block on initializing EDEK cache
> 
>
> Key: HDFS-11885
> URL: https://issues.apache.org/jira/browse/HDFS-11885
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 2.6.5
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-11885.001.patch, HDFS-11885.002.patch, 
> HDFS-11885.003.patch, HDFS-11885.004.patch
>
>
> When creating an encryption zone, we call {{ensureKeyIsInitialized}}, which 
> calls {{provider.warmUpEncryptedKeys(keyName)}}. This is a blocking call, 
> which attempts to fill the key cache up to the low watermark.
> If the KMS is down or slow, this can take a very long time, and cause the 
> createZone RPC to fail with a timeout.
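> A sketch of a non-blocking alternative (the background executor is an 
> assumption for illustration, not the committed patch; {{provider}}, 
> {{keyName}}, and LOG are from the surrounding NN context):
> {code:java}
> import java.io.IOException;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> 
> ExecutorService edekWarmUpExecutor = Executors.newSingleThreadExecutor();
> // Fill the EDEK cache in the background so createZone returns immediately
> // even when the KMS is slow or down.
> edekWarmUpExecutor.submit(() -> {
>   try {
>     provider.warmUpEncryptedKeys(keyName);
>   } catch (IOException e) {
>     LOG.warn("EDEK cache warm-up failed for key " + keyName, e);
>   }
> });
> {code}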



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12654) APPEND API call is different in HTTPFS and NameNode REST

2017-11-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12654:
---
Target Version/s: 3.0.0, 2.9.1, 2.8.4  (was: 2.6.0, 2.7.0, 2.8.0, 3.0.0)

> APPEND API call is different in HTTPFS and NameNode REST
> 
>
> Key: HDFS-12654
> URL: https://issues.apache.org/jira/browse/HDFS-12654
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, httpfs, namenode
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 3.0.0-beta1
>Reporter: Andras Czesznak
>
> The APPEND REST API call behaves differently in the NameNode REST and the 
> HTTPFS code. The NameNode version creates the target file that the new data 
> is appended to if it does not exist at the time the call is issued. The 
> HTTPFS version assumes the target file exists when APPEND is called and can 
> only append the new data; it does not create the target file if it doesn't 
> exist.
> The two implementations should be standardized, preferably by modifying the 
> HTTPFS version to execute an implicit CREATE if the target file does not 
> exist.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12714) Hadoop 3 missing fix for HDFS-5169

2017-11-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12714:
---
Fix Version/s: (was: 3.0.0-beta1)
   3.0.0

> Hadoop 3 missing fix for HDFS-5169
> --
>
> Key: HDFS-12714
> URL: https://issues.apache.org/jira/browse/HDFS-12714
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.0.0-alpha1, 3.0.0-beta1, 3.0.0-alpha2, 3.0.0-alpha4, 
> 3.0.0-alpha3
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HDFS-12714.001.patch
>
>
> HDFS-5169 is a fix for a null pointer dereference in translateZCRException. 
> This line in hdfs.c:
> ret = printExceptionAndFree(env, jthr, PRINT_EXC_ALL, "hadoopZeroCopyRead: 
> ZeroCopyCursor#read failed");
> should be:
> ret = printExceptionAndFree(env, exc, PRINT_EXC_ALL, "hadoopZeroCopyRead: 
> ZeroCopyCursor#read failed");
> Plainly, translateZCRException should print the exception (exc) passed in to 
> the function rather than the uninitialized local jthr.
> The fix for HDFS-5169 (part of HDFS-4949) exists on hadoop 2.* branches, but 
> it is missing on hadoop 3 branches including trunk.
> Hadoop 2.8:
> https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c#L2514
> Hadoop 3.0:
> https://github.com/apache/hadoop/blob/branch-3.0/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c#L2691



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-11-01 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11096:
---
Target Version/s: 3.0.1  (was: 3.0.0)

Thanks Sean, I'm going to bump this to 3.0.1 then.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, 
> HDFS-11096.006.patch, HDFS-11096.007.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12681) Fold HdfsLocatedFileStatus into HdfsFileStatus

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227517#comment-16227517
 ] 

Andrew Wang commented on HDFS-12681:


Thanks for bringing this up Chris. In the interest of keeping the branch 
releasable, could we revert for now, possibly retargeting for 3.1.0?

> Fold HdfsLocatedFileStatus into HdfsFileStatus
> --
>
> Key: HDFS-12681
> URL: https://issues.apache.org/jira/browse/HDFS-12681
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chris Douglas
>Priority: Minor
> Attachments: HDFS-12681.00.patch, HDFS-12681.01.patch, 
> HDFS-12681.02.patch, HDFS-12681.03.patch, HDFS-12681.04.patch, 
> HDFS-12681.05.patch, HDFS-12681.06.patch, HDFS-12681.07.patch, 
> HDFS-12681.08.patch, HDFS-12681.09.patch, HDFS-12681.10.patch
>
>
> {{HdfsLocatedFileStatus}} is a subtype of {{HdfsFileStatus}}, but not of 
> {{LocatedFileStatus}}. Conversion requires copying common fields and shedding 
> unknown data. It would be cleaner and sufficient for {{HdfsFileStatus}} to 
> extend {{LocatedFileStatus}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227427#comment-16227427
 ] 

Andrew Wang commented on HDFS-11096:


Sean, given that we've run HDFS rolling upgrades successfully, do you think we 
can pick this up in a later 3.0.x release?

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch, 
> HDFS-11096.006.patch, HDFS-11096.007.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to rolling upgrade 
> to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12699) TestMountTable fails with Java 7

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227422#comment-16227422
 ] 

Andrew Wang commented on HDFS-12699:


Inigo, do you mind setting appropriate fix versions for this JIRA?

> TestMountTable fails with Java 7
> 
>
> Key: HDFS-12699
> URL: https://issues.apache.org/jira/browse/HDFS-12699
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
> Attachments: HDFS-12699-branch-2.000.patch, HDFS-12699.000.patch
>
>
> Some of the issues for HDFS-12620 were related to Java 7.
> In particular, we relied on the {{HashMap}} order (which is wrong).
> This worked by chance with Java 8 (trunk) but not with Java 7 (branch-2).
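> For illustration, the kind of change needed (a sketch, not the actual patch):
> {code:java}
> import java.util.Map;
> import java.util.TreeMap;
> 
> // HashMap iteration order is unspecified and changed between Java 7 and
> // Java 8; a test that asserts on ordering must use an ordered map.
> Map<String, String> mountTable = new TreeMap<>(); // deterministic, sorted
> mountTable.put("/ns1", "dest-ns1");
> mountTable.put("/ns0", "dest-ns0");
> // Iterates as /ns0 then /ns1 on every JVM version.
> {code}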



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12485) expunge may fail to remove trash from encryption zone

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227404#comment-16227404
 ] 

Andrew Wang commented on HDFS-12485:


I think the branch-3.0 backport was missed, I backported it.

> expunge may fail to remove trash from encryption zone
> -
>
> Key: HDFS-12485
> URL: https://issues.apache.org/jira/browse/HDFS-12485
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Fix For: 2.9.0, 2.8.3, 3.0.0
>
> Attachments: HDFS-12485.001.patch
>
>
> This is related to HDFS-12484, but it turns out that even if I have super user 
> permission, -expunge may not remove trash either.
> If I log into Linux as root, and then login as the superuser h...@example.com
> {noformat}
> [root@nightly511-1 ~]# hdfs dfs -rm /scale/b
> 17/09/18 15:21:32 INFO fs.TrashPolicyDefault: Moved: 'hdfs://ns1/scale/b' to 
> trash at: hdfs://ns1/scale/.Trash/hdfs/Current/scale/b
> [root@nightly511-1 ~]# hdfs dfs -expunge
> 17/09/18 15:21:59 INFO fs.TrashPolicyDefault: 
> TrashPolicyDefault#deleteCheckpoint for trashRoot: hdfs://ns1/user/hdfs/.Trash
> 17/09/18 15:21:59 INFO fs.TrashPolicyDefault: 
> TrashPolicyDefault#deleteCheckpoint for trashRoot: hdfs://ns1/user/hdfs/.Trash
> 17/09/18 15:21:59 INFO fs.TrashPolicyDefault: Deleted trash checkpoint: 
> /user/hdfs/.Trash/170918143916
> 17/09/18 15:21:59 INFO fs.TrashPolicyDefault: 
> TrashPolicyDefault#createCheckpoint for trashRoot: hdfs://ns1/user/hdfs/.Trash
> [root@nightly511-1 ~]# hdfs dfs -ls 
> hdfs://ns1/scale/.Trash/hdfs/Current/scale/b
> -rw-r--r--   3 hdfs systest  0 2017-09-18 15:21 
> hdfs://ns1/scale/.Trash/hdfs/Current/scale/b
> {noformat}
> expunge does not remove trash under /scale, because it does not know I am 
> the 'hdfs' user.
> {code:title=DistributedFileSystem#getTrashRoots}
> Path ezTrashRoot = new Path(it.next().getPath(),
> FileSystem.TRASH_PREFIX);
> if (!exists(ezTrashRoot)) {
>   continue;
> }
> if (allUsers) {
>   for (FileStatus candidate : listStatus(ezTrashRoot)) {
> if (exists(candidate.getPath())) {
>   ret.add(candidate);
> }
>   }
> } else {
>   Path userTrash = new Path(ezTrashRoot, System.getProperty(
>   "user.name")); --> bug
>   try {
> ret.add(getFileStatus(userTrash));
>   } catch (FileNotFoundException ignored) {
>   }
> }
> {code}
> It should use the UGI for the user name, rather than the system login user name.
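> A minimal sketch of that fix (context abbreviated to the snippet above):
> {code:java}
> import org.apache.hadoop.security.UserGroupInformation;
> 
> // Resolve the user from the UGI rather than the JVM login user, so that
> // logging into Linux as root but running as 'hdfs' still finds the trash.
> String userName = UserGroupInformation.getCurrentUser().getShortUserName();
> Path userTrash = new Path(ezTrashRoot, userName);
> {code}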



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12495) TestPendingInvalidateBlock#testPendingDeleteUnknownBlocks fails intermittently

2017-10-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12495:
---
Fix Version/s: (was: 3.0.0)

> TestPendingInvalidateBlock#testPendingDeleteUnknownBlocks fails intermittently
> --
>
> Key: HDFS-12495
> URL: https://issues.apache.org/jira/browse/HDFS-12495
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-beta1, 2.8.2
>Reporter: Eric Badger
>Assignee: Eric Badger
>  Labels: flaky-test
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.3, 3.1.0
>
> Attachments: HDFS-12495.001.patch, HDFS-12495.002.patch
>
>
> {noformat}
> java.net.BindException: Problem binding to [localhost:36701] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:433)
>   at sun.nio.ch.Net.bind(Net.java:425)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:546)
>   at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:955)
>   at org.apache.hadoop.ipc.Server.<init>(Server.java:2655)
>   at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:968)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:367)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:342)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:810)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initIpcServer(DataNode.java:954)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1314)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:481)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2611)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2499)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2546)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2152)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock.testPendingDeleteUnknownBlocks(TestPendingInvalidateBlock.java:175)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12523) Thread pools in ErasureCodingWorker do not shutdown

2017-10-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12523:
---
Fix Version/s: (was: 3.0.0)

> Thread pools in ErasureCodingWorker do not shutdown
> ---
>
> Key: HDFS-12523
> URL: https://issues.apache.org/jira/browse/HDFS-12523
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha4
>Reporter: Lei (Eddy) Xu
>Assignee: Huafeng Wang
> Fix For: 3.0.0-beta1
>
> Attachments: HDFS-12523.001.patch, HDFS-12523.002.patch
>
>
> There is no code path in {{ErasureCodingWorker}} to shut down its two thread 
> pools: {{stripedReconstructionPool}} and {{stripedReadPool}}.
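> The missing hook would look roughly like this (the method name is an 
> assumption for illustration):
> {code:java}
> // Called from the DataNode shutdown path: stop accepting new striped
> // read/reconstruction tasks and let in-flight work drain.
> void shutDown() {
>   stripedReconstructionPool.shutdown();
>   stripedReadPool.shutdown();
> }
> {code}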



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12544) SnapshotDiff - support diff generation on any snapshot root descendant directory

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227399#comment-16227399
 ] 

Andrew Wang commented on HDFS-12544:


I think the branch-3.0 backport was missed, I backported it.

> SnapshotDiff - support diff generation on any snapshot root descendant 
> directory
> 
>
> Key: HDFS-12544
> URL: https://issues.apache.org/jira/browse/HDFS-12544
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Fix For: 3.0.0
>
> Attachments: HDFS-12544.01.patch, HDFS-12544.02.patch, 
> HDFS-12544.03.patch, HDFS-12544.04.patch, HDFS-12544.05.patch
>
>
> {noformat}
> # hdfs snapshotDiff <snapshotDir> <fromSnapshot> <toSnapshot>
> {noformat}
> Using snapshot diff command, we can generate a diff report between any two 
> given snapshots under a snapshot root directory. The command today only 
> accepts the path that is a snapshot root. There are many deployments where 
> the snapshot root is configured at the higher level directory but the diff 
> report needed is only for a specific directory under the snapshot root. In 
> these cases, the diff report can be filtered for changes pertaining to the 
> directory we are interested in. But when the snapshot root directory is very 
> huge, the snapshot diff report generation can take minutes even if we are 
> interested only in the changes in a small directory. So, it would be much 
> more performant if the diff report calculation could be limited to only the 
> interesting sub-directory of the snapshot root instead of the whole snapshot 
> root.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12573) Divide the total block metrics into replica and ec

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227397#comment-16227397
 ] 

Andrew Wang commented on HDFS-12573:


I think the branch-3.0 backport was missed, I backported it.

> Divide the total block metrics into replica and ec
> --
>
> Key: HDFS-12573
> URL: https://issues.apache.org/jira/browse/HDFS-12573
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, metrics, namenode
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
> Fix For: 3.0.0
>
> Attachments: HDFS-12573.1.patch, HDFS-12573.2.patch, 
> HDFS-12573.3.patch
>
>
> Following HDFS-10999, let's separate total blocks metrics. It would be useful 
> for administrators.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12614) FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider configured

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227395#comment-16227395
 ] 

Andrew Wang commented on HDFS-12614:


I think the branch-3.0 backport was missed, I backported it.

> FSPermissionChecker#getINodeAttrs() throws NPE when INodeAttributesProvider 
> configured
> --
>
> Key: HDFS-12614
> URL: https://issues.apache.org/jira/browse/HDFS-12614
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Fix For: 3.0.0
>
> Attachments: HDFS-12614.01.patch, HDFS-12614.02.patch, 
> HDFS-12614.03.patch, HDFS-12614.04.patch, HDFS-12614.test.01.patch
>
>
> When INodeAttributesProvider is configured, and when resolving path (like 
> "/") and checking for permission, the following code when working on 
> {{pathByNameArr}} throws NullPointerException. 
> {noformat}
>   private INodeAttributes getINodeAttrs(byte[][] pathByNameArr, int pathIdx,
>   INode inode, int snapshotId) {
> INodeAttributes inodeAttrs = inode.getSnapshotINode(snapshotId);
> if (getAttributesProvider() != null) {
>   String[] elements = new String[pathIdx + 1];
>   for (int i = 0; i < elements.length; i++) {
> elements[i] = DFSUtil.bytes2String(pathByNameArr[i]);  <===
>   }
>   inodeAttrs = getAttributesProvider().getAttributes(elements, 
> inodeAttrs);
> }
> return inodeAttrs;
>   }
> {noformat}
> It looks like for paths like "/", where the path components split on the 
> delimiter "/" can be null, the pathByNameArr array can have null elements and 
> can throw an NPE.
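> One possible guard, as an illustration (not necessarily the committed fix):
> {code:java}
> for (int i = 0; i < elements.length; i++) {
>   // Root ("/") resolves to a null/empty first component; substitute an
>   // empty string instead of dereferencing null.
>   elements[i] = (pathByNameArr[i] == null)
>       ? "" : DFSUtil.bytes2String(pathByNameArr[i]);
> }
> {code}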



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12619) Do not catch and throw unchecked exceptions if IBRs fail to process

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227394#comment-16227394
 ] 

Andrew Wang commented on HDFS-12619:


I think the branch-3.0 commit was missed, I backported it.

> Do not catch and throw unchecked exceptions if IBRs fail to process
> ---
>
> Key: HDFS-12619
> URL: https://issues.apache.org/jira/browse/HDFS-12619
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha1
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
> Fix For: 2.9.0, 2.8.3, 3.0.0
>
> Attachments: HDFS-12619.001.patch
>
>
> HDFS-9198 added the following code
> {code:title=BlockManager#processIncrementalBlockReport}
> public void processIncrementalBlockReport(final DatanodeID nodeID,
>   final StorageReceivedDeletedBlocks srdb) throws IOException {
> ...
> try {
>   processIncrementalBlockReport(node, srdb);
> } catch (Exception ex) {
>   node.setForceRegistration(true);
>   throw ex;
> }
>   }
> {code}
> In Apache Hadoop 2.7.x ~ 3.0, the code snippet is accepted by the Java 
> compiler. However, when I attempted to backport it to a CDH5.3 release (based 
> on Apache Hadoop 2.5.0), the compiler complained that the exception is 
> unhandled, because the method declares that it throws IOException rather than 
> Exception.
> While the code compiles for Apache Hadoop 2.7.x ~ 3.0, I feel it is not a 
> good practice to catch an unchecked exception and then rethrow it. How about 
> rewriting it with a finally block and a conditional variable?
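> For example, a sketch of that rewrite:
> {code:java}
> // No catch/rethrow of an unchecked exception; a flag records whether
> // processing completed normally, and the finally block acts on it.
> boolean successful = false;
> try {
>   processIncrementalBlockReport(node, srdb);
>   successful = true;
> } finally {
>   if (!successful) {
>     node.setForceRegistration(true);
>   }
> }
> {code}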



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12622) Fix enumerate in HDFSErasureCoding.md

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227392#comment-16227392
 ] 

Andrew Wang commented on HDFS-12622:


I think the branch-3.0 commit was missed, I backported it.

> Fix enumerate in HDFSErasureCoding.md
> -
>
> Key: HDFS-12622
> URL: https://issues.apache.org/jira/browse/HDFS-12622
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Akira Ajisaka
>Assignee: Yiqun Lin
>Priority: Minor
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: HDFS-12622.001.patch, HDFS-12622.001.patch, Screen Shot 
> 2017-10-10 at 17.36.16.png, screenshot.png
>
>
> {noformat}
>   HDFS native implementation of default RS codec leverages Intel ISA-L 
> library to improve the encoding and decoding calculation. To enable and use 
> Intel ISA-L, there are three steps.
>   1. Build ISA-L library. Please refer to the official site 
> "https://github.com/01org/isa-l/; for detail information.
>   2. Build Hadoop with ISA-L support. Please refer to "Intel ISA-L build 
> options" section in "Build instructions for Hadoop" in (BUILDING.txt) in the 
> source code.
>   3. Use `-Dbundle.isal` to copy the contents of the `isal.lib` directory 
> into the final tar file. Deploy Hadoop with the tar file. Make sure ISA-L is 
> available on HDFS clients and DataNodes.
> {noformat}
> Missing empty line before enumerate.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227202#comment-16227202
 ] 

Andrew Wang commented on HDFS-11467:


Hey folks, are we planning to close on this in the next few days? Looks like 
HDFS-12682 is pretty close, so not a bad idea to rebase on top of the latest 
patch to get precommit runs and reviews started here.

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch, HDFS-11467.002.patch
>
>
> As discussed in HDFS-7859, after ErasureCoding section is added into fsimage, 
> we would like to also support exporting this section into an XML back and 
> forth using the OIV tool.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12499) dfs.namenode.shared.edits.dir property is currently namenode specific key

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227197#comment-16227197
 ] 

Andrew Wang commented on HDFS-12499:


I've reverted this from trunk and branch-3.0. Apologies for the churn.

One more thing, when we discussed the branching strategy for GA, the community 
consensus was to avoid Hadoop 4 until it was really needed. So, before adding 
another branch, we should start a mailing list discussion with the rationale.

> dfs.namenode.shared.edits.dir property is currently namenode specific key
> -
>
> Key: HDFS-12499
> URL: https://issues.apache.org/jira/browse/HDFS-12499
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
> Attachments: HDFS-12499.01.patch, HDFS-12499.02.patch
>
>
> HDFS + Federation cluster + QJM
> The dfs.namenode.shared.edits.dir property can be set as
> 1. dfs.namenode.shared.edits.dir.<nameserviceId>
> 2. dfs.namenode.shared.edits.dir.<nameserviceId>.<namenodeId>
> Configuring it both ways is currently supported. Option 2 should not be 
> supported, as for a particular nameservice the quorum of journal nodes should 
> be the same.
> If option 2 is supported, then for a nameservice that has two namenodes, users 
> can configure different journal node values for each namenode, which is 
> incorrect.
> Example:
> <property>
>   <name>dfs.nameservices</name>
>   <value>ns1,ns2</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns1</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns2</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns1.nn1</name>
>   <value>qjournal://mycluster-node-1:8485;mycluster-node-2:8485;mycluster-node-3:8485/ns1</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns1.nn2</name>
>   <value>qjournal://mycluster-node-3:8485;mycluster-node-4:8485;mycluster-node-5:8485/ns1</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns2.nn1</name>
>   <value>qjournal://mycluster-node-1:8485;mycluster-node-2:8485;mycluster-node-3:8485/ns2</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns2.nn2</name>
>   <value>qjournal://mycluster-node-3:8485;mycluster-node-4:8485;mycluster-node-5:8485/ns2</value>
> </property>
> This jira is to discuss whether we need to support the 2nd way of configuring, 
> or remove it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12499) dfs.namenode.shared.edits.dir property is currently namenode specific key

2017-10-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12499:
---
Fix Version/s: (was: 3.0.0)

> dfs.namenode.shared.edits.dir property is currently namenode specific key
> -
>
> Key: HDFS-12499
> URL: https://issues.apache.org/jira/browse/HDFS-12499
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
> Attachments: HDFS-12499.01.patch, HDFS-12499.02.patch
>
>
> HDFS + Federation cluster + QJM
> The dfs.namenode.shared.edits.dir property can be set as
> 1. dfs.namenode.shared.edits.dir.<nameserviceId>
> 2. dfs.namenode.shared.edits.dir.<nameserviceId>.<namenodeId>
> Configuring it both ways is currently supported. Option 2 should not be 
> supported, as for a particular nameservice the quorum of journal nodes should 
> be the same.
> If option 2 is supported, then for a nameservice that has two namenodes, users 
> can configure different journal node values for each namenode, which is 
> incorrect.
> Example:
> <property>
>   <name>dfs.nameservices</name>
>   <value>ns1,ns2</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns1</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns2</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns1.nn1</name>
>   <value>qjournal://mycluster-node-1:8485;mycluster-node-2:8485;mycluster-node-3:8485/ns1</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns1.nn2</name>
>   <value>qjournal://mycluster-node-3:8485;mycluster-node-4:8485;mycluster-node-5:8485/ns1</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns2.nn1</name>
>   <value>qjournal://mycluster-node-1:8485;mycluster-node-2:8485;mycluster-node-3:8485/ns2</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns2.nn2</name>
>   <value>qjournal://mycluster-node-3:8485;mycluster-node-4:8485;mycluster-node-5:8485/ns2</value>
> </property>
> This jira is to discuss whether we need to support the 2nd way of configuring, 
> or remove it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDFS-12499) dfs.namenode.shared.edits.dir property is currently namenode specific key

2017-10-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reopened HDFS-12499:


> dfs.namenode.shared.edits.dir property is currently namenode specific key
> -
>
> Key: HDFS-12499
> URL: https://issues.apache.org/jira/browse/HDFS-12499
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
> Attachments: HDFS-12499.01.patch, HDFS-12499.02.patch
>
>
> HDFS federation cluster + QJM.
> The dfs.namenode.shared.edits.dir property can currently be set as either:
> 1. dfs.namenode.shared.edits.dir.<nameserviceId>
> 2. dfs.namenode.shared.edits.dir.<nameserviceId>.<namenodeId>
> Both styles are supported today. Option 2 should not be supported, since all 
> namenodes of a particular nameservice should share the same quorum of journal 
> nodes.
> If option 2 is supported, users can configure different journal node values 
> for the two namenodes of a single nameservice ID, which is incorrect.
> Example:
> <property>
>   <name>dfs.nameservices</name>
>   <value>ns1,ns2</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns1</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns2</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns1.nn1</name>
>   <value>qjournal://mycluster-node-1:8485;mycluster-node-2:8485;mycluster-node-3:8485/ns1</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns1.nn2</name>
>   <value>qjournal://mycluster-node-3:8485;mycluster-node-4:8485;mycluster-node-5:8485/ns1</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns2.nn1</name>
>   <value>qjournal://mycluster-node-1:8485;mycluster-node-2:8485;mycluster-node-3:8485/ns2</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns2.nn2</name>
>   <value>qjournal://mycluster-node-3:8485;mycluster-node-4:8485;mycluster-node-5:8485/ns2</value>
> </property>
> This JIRA is to discuss whether we should keep supporting the second 
> configuration style or remove it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12499) dfs.namenode.shared.edits.dir property is currently namenode specific key

2017-10-31 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227185#comment-16227185
 ] 

Andrew Wang commented on HDFS-12499:


Sorry, I was off yesterday. I just checked, and Cloudera Manager actually does 
set it in style 2.

Two questions from reviewing the JIRA:
* Is there some way to make this backwards compatible, like having one conf 
override the other?
* Is the rationale for this change to simplify the configuration options?

Since this doesn't look like a critical bug fix and it does have real world 
compatibility implications, I'm going to revert this while we discuss. I was 
planning on posting an RC for 3.0.0 GA this week.
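
To make the first question concrete, here is a minimal sketch of the override 
idea (a hypothetical helper, not an actual DFSUtil method): resolve the most 
specific key first and fall back, so existing style-2 configs keep working while 
style 1 stays the documented form.

{code}
// Hypothetical resolver sketch: per-namenode key (style 2) wins if present,
// then the per-nameservice key (style 1), then the bare key.
import org.apache.hadoop.conf.Configuration;

public class SharedEditsResolver {
  private static final String KEY = "dfs.namenode.shared.edits.dir";

  public static String resolveSharedEditsDir(
      Configuration conf, String nsId, String nnId) {
    String value = conf.get(KEY + "." + nsId + "." + nnId); // style 2
    if (value == null) {
      value = conf.get(KEY + "." + nsId);                   // style 1
    }
    if (value == null) {
      value = conf.get(KEY);                                // non-federated key
    }
    return value;
  }
}
{code}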


> dfs.namenode.shared.edits.dir property is currently namenode specific key
> -
>
> Key: HDFS-12499
> URL: https://issues.apache.org/jira/browse/HDFS-12499
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
> Fix For: 3.0.0
>
> Attachments: HDFS-12499.01.patch, HDFS-12499.02.patch
>
>
> HDFS federation cluster + QJM.
> The dfs.namenode.shared.edits.dir property can currently be set as either:
> 1. dfs.namenode.shared.edits.dir.<nameserviceId>
> 2. dfs.namenode.shared.edits.dir.<nameserviceId>.<namenodeId>
> Both styles are supported today. Option 2 should not be supported, since all 
> namenodes of a particular nameservice should share the same quorum of journal 
> nodes.
> If option 2 is supported, users can configure different journal node values 
> for the two namenodes of a single nameservice ID, which is incorrect.
> Example:
> <property>
>   <name>dfs.nameservices</name>
>   <value>ns1,ns2</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns1</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns2</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns1.nn1</name>
>   <value>qjournal://mycluster-node-1:8485;mycluster-node-2:8485;mycluster-node-3:8485/ns1</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns1.nn2</name>
>   <value>qjournal://mycluster-node-3:8485;mycluster-node-4:8485;mycluster-node-5:8485/ns1</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns2.nn1</name>
>   <value>qjournal://mycluster-node-1:8485;mycluster-node-2:8485;mycluster-node-3:8485/ns2</value>
> </property>
> <property>
>   <name>dfs.namenode.shared.edits.dir.ns2.nn2</name>
>   <value>qjournal://mycluster-node-3:8485;mycluster-node-4:8485;mycluster-node-5:8485/ns2</value>
> </property>
> This JIRA is to discuss whether we should keep supporting the second 
> configuration style or remove it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12686) Erasure coding system policy state is not correctly saved and loaded during real cluster restart

2017-10-20 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12686:
---
Priority: Blocker  (was: Critical)

> Erasure coding system policy state is not correctly saved and loaded during 
> real cluster restart
> 
>
> Key: HDFS-12686
> URL: https://issues.apache.org/jira/browse/HDFS-12686
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: SammiChen
>Assignee: SammiChen
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
>
> Inspired by HDFS-12682, I found that the system erasure coding policy state is 
> not correctly saved and loaded in a real cluster, even though unit tests for 
> exactly this all pass against MiniCluster. That's because MiniCluster keeps the 
> same static system erasure coding policy object across the NN restart 
> operation.
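
As a self-contained illustration of the pitfall (plain Java, deliberately not 
the Hadoop classes): a JVM-wide static registry makes a persist-then-reload test 
pass inside one JVM even when the persisted form drops the state, because the 
reload path consults the in-process cache rather than what was actually saved.

{code}
import java.util.HashMap;
import java.util.Map;

public class StaticRegistryPitfall {
  enum State { ENABLED, DISABLED }

  // Stands in for the static system EC policy table, which survives a
  // MiniCluster "restart" because the JVM never goes away.
  private static final Map<Integer, State> CACHE = new HashMap<>();

  static void enable(int policyId) {
    CACHE.put(policyId, State.ENABLED);
  }

  // Simulates loading a policy back from an fsimage that did not persist the
  // state: inside the same JVM the cache hides the loss.
  static State loadAfterRestart(int policyId) {
    return CACHE.getOrDefault(policyId, State.DISABLED);
  }

  public static void main(String[] args) {
    enable(1);
    // Prints ENABLED in-process (the MiniCluster case); a freshly started
    // NameNode reading the same image would report DISABLED.
    System.out.println(loadAfterRestart(1));
  }
}
{code}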



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11467) Support ErasureCoding section in OIV XML/ReverseXML

2017-10-20 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11467:
---
Priority: Blocker  (was: Major)

> Support ErasureCoding section in OIV XML/ReverseXML
> ---
>
> Key: HDFS-11467
> URL: https://issues.apache.org/jira/browse/HDFS-11467
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: tools
>Affects Versions: 3.0.0-alpha4
>Reporter: Wei-Chiu Chuang
>Assignee: Huafeng Wang
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-11467.001.patch, HDFS-11467.002.patch
>
>
> As discussed in HDFS-7859, now that the ErasureCoding section is added to the 
> fsimage, we would like the OIV tool to also support exporting this section to 
> XML and back.
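
For context, the round trip being targeted is presumably the OIV {{-p XML}} and 
{{-p ReverseXML}} processors, i.e. roughly {{hdfs oiv -p XML -i fsimage -o 
fsimage.xml}} followed by {{hdfs oiv -p ReverseXML -i fsimage.xml -o fsimage}}, 
with both directions handling the new ErasureCoding section.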



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12497) Re-enable TestDFSStripedOutputStreamWithFailure tests

2017-10-20 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12497:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks Huafeng, committed to trunk and branch-3.0!

> Re-enable TestDFSStripedOutputStreamWithFailure tests
> -
>
> Key: HDFS-12497
> URL: https://issues.apache.org/jira/browse/HDFS-12497
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-beta1
>Reporter: Andrew Wang
>Assignee: Huafeng Wang
>  Labels: flaky-test, hdfs-ec-3.0-must-do
> Fix For: 3.0.0
>
> Attachments: HDFS-12497.001.patch, HDFS-12497.002.patch, 
> HDFS-12497.003.patch, HDFS-12497.004.patch
>
>
> We disabled this suite of tests in HDFS-12417 since they were very flaky. We 
> should fix these tests and re-enable them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11096) Support rolling upgrade between 2.x and 3.x

2017-10-20 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213176#comment-16213176
 ] 

Andrew Wang commented on HDFS-11096:


Folks, is this going to be committed by the end of the month? Haven't seen an 
update recently.

> Support rolling upgrade between 2.x and 3.x
> ---
>
> Key: HDFS-11096
> URL: https://issues.apache.org/jira/browse/HDFS-11096
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rolling upgrades
>Affects Versions: 3.0.0-alpha1
>Reporter: Andrew Wang
>Assignee: Sean Mackrory
>Priority: Blocker
> Attachments: HDFS-11096.001.patch, HDFS-11096.002.patch, 
> HDFS-11096.003.patch, HDFS-11096.004.patch, HDFS-11096.005.patch
>
>
> trunk has a minimum software version of 3.0.0-alpha1. This means we can't do a 
> rolling upgrade between branch-2 and trunk.
> This is a showstopper for large deployments. Unless there are very compelling 
> reasons to break compatibility, let's restore the ability to perform rolling 
> upgrades to 3.x releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED

2017-10-19 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-12682:
---
Priority: Blocker  (was: Major)

> ECAdmin -listPolicies will always show policy state as DISABLED
> ---
>
> Key: HDFS-12682
> URL: https://issues.apache.org/jira/browse/HDFS-12682
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Blocker
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-12682.01.patch
>
>
> On a real cluster, {{hdfs ec -listPolicies}} will always show policy state as 
> DISABLED.
> {noformat}
> [hdfs@nightly6x-1 root]$ hdfs ec -listPolicies
> Erasure Coding Policies:
> ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, 
> Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], 
> CellSize=1048576, Id=3, State=DISABLED]
> ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, 
> numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4, State=DISABLED]
> [hdfs@nightly6x-1 root]$ hdfs ec -getPolicy -path /ecec
> XOR-2-1-1024k
> {noformat}
> This is because when [deserializing 
> protobuf|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java#L2942],
>  the static instance of [SystemErasureCodingPolicies 
> class|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/SystemErasureCodingPolicies.java#L101]
>  is first checked, and always returns the cached policy objects, which are 
> created by default with state=DISABLED.
> All the existing unit tests pass, because the static instance that the 
> client (e.g. ECAdmin) reads in unit tests is the same one the NN updates. :)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED

2017-10-18 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210230#comment-16210230
 ] 

Andrew Wang commented on HDFS-12682:


My concern was actually for the clients, since there are apps (Hive, Impala) 
that list thousands or millions of files.

I assume we can do a hybrid approach, where we get most of the fields from the 
static class, but get the enabled/disabled state from the PB?
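
Roughly along these lines, with stand-in types rather than the real 
PBHelperClient and SystemErasureCodingPolicies APIs: reuse the cached policy for 
the immutable fields, but always honor the state carried on the wire.

{code}
import java.util.Map;

public class HybridPolicyDecode {
  enum State { ENABLED, DISABLED }

  static final class Policy {
    final String name; final int cellSize; final State state;
    Policy(String name, int cellSize, State state) {
      this.name = name; this.cellSize = cellSize; this.state = state;
    }
    @Override public String toString() {
      return name + " (cellSize=" + cellSize + ", state=" + state + ")";
    }
  }

  // Stand-in for the wire message the NameNode sends.
  static final class PolicyProto {
    final int id; final State state;
    PolicyProto(int id, State state) { this.id = id; this.state = state; }
  }

  // Stand-in for the static system policy cache (defaults to DISABLED).
  static final Map<Integer, Policy> SYSTEM =
      Map.of(4, new Policy("XOR-2-1-1024k", 1048576, State.DISABLED));

  static Policy fromProto(PolicyProto proto) {
    Policy cached = SYSTEM.get(proto.id);
    if (cached == null) {
      throw new IllegalArgumentException("Unknown policy id " + proto.id);
    }
    // Hybrid: cheap reuse of the cached immutable fields, fresh state per
    // message, so clients see ENABLED/DISABLED as the NameNode reported it.
    return new Policy(cached.name, cached.cellSize, proto.state);
  }

  public static void main(String[] args) {
    // The NN reports XOR-2-1-1024k as ENABLED; the decoded policy reflects
    // that even though the cached copy says DISABLED.
    System.out.println(fromProto(new PolicyProto(4, State.ENABLED)));
  }
}
{code}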

> ECAdmin -listPolicies will always show policy state as DISABLED
> ---
>
> Key: HDFS-12682
> URL: https://issues.apache.org/jira/browse/HDFS-12682
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>  Labels: hdfs-ec-3.0-must-do
>
> On a real cluster, {{hdfs ec -listPolicies}} will always show policy state as 
> DISABLED.
> {noformat}
> [hdfs@nightly6x-1 root]$ hdfs ec -listPolicies
> Erasure Coding Policies:
> ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, 
> Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], 
> CellSize=1048576, Id=3, State=DISABLED]
> ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, 
> numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4, State=DISABLED]
> [hdfs@nightly6x-1 root]$ hdfs ec -getPolicy -path /ecec
> XOR-2-1-1024k
> {noformat}
> This is because when [deserializing 
> protobuf|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java#L2942],
>  the static instance of [SystemErasureCodingPolicies 
> class|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/SystemErasureCodingPolicies.java#L101]
>  is first checked, and always returns the cached policy objects, which are 
> created by default with state=DISABLED.
> All the existing unit tests pass, because the static instance that the 
> client (e.g. ECAdmin) reads in unit tests is the same one the NN updates. :)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12310) [SPS]: Provide an option to track the status of in progress requests

2017-10-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208474#comment-16208474
 ] 

Andrew Wang commented on HDFS-12310:


Sorry, I don't think I'll have time to review this. Maybe [~eddyxu]?

> [SPS]: Provide an option to track the status of in progress requests
> 
>
> Key: HDFS-12310
> URL: https://issues.apache.org/jira/browse/HDFS-12310
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Uma Maheswara Rao G
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-12310-HDFS-10285-01.patch, 
> HDFS-12310-HDFS-10285-02.patch, HDFS-12310-HDFS-10285-03.patch
>
>
> As per [~andrew.wang]'s review comments in HDFS-10285, this is the JIRA for 
> tracking the options for how we track the progress of SPS requests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12669) Implement toString() for EditLogInputStream

2017-10-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208167#comment-16208167
 ] 

Andrew Wang commented on HDFS-12669:


Sure, I added you as a contributor to all the hadoop-y projects.

> Implement toString() for EditLogInputStream
> ---
>
> Key: HDFS-12669
> URL: https://issues.apache.org/jira/browse/HDFS-12669
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Chao Sun
>Priority: Minor
>
> Currently {{EditLogInputStream}} has {{getName()}} but doesn't implement 
> {{toString()}}. The latter would be useful for debugging; right now it just 
> prints out messages like:
> {code}
> 2017-10-16 20:41:13,456 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@1eb6749b 
> expecting start txid #8137
> {code}
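
For reference, the change could be as small as delegating to the existing name. 
A minimal sketch (simplified stand-in for the real abstract class, assuming 
{{getName()}} is the string we want to surface; the eventual patch may differ):

{code}
// Sketch: have toString() reuse the existing getName(), so the log line above
// shows the stream's name instead of ClassName@hash.
public abstract class EditLogInputStream {
  public abstract String getName();

  @Override
  public String toString() {
    return getName();
  }
}
{code}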



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


