[jira] [Created] (HDFS-12224) Add tests to TestJournalNodeSync for sync after JN downtime

2017-07-28 Thread Hanisha Koneru (JIRA)
Hanisha Koneru created HDFS-12224:
-

 Summary: Add tests to TestJournalNodeSync for sync after JN 
downtime
 Key: HDFS-12224
 URL: https://issues.apache.org/jira/browse/HDFS-12224
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Reporter: Hanisha Koneru
Assignee: Hanisha Koneru


Adding unit tests for JN sync when the JN has downtime and is formatted during 
that downtime.
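
A rough outline of the kind of test intended here (the helper names are 
hypothetical and only stand in for the mini-cluster plumbing already used by 
TestJournalNodeSync):

{code:java}
// Sketch only -- stopJournalNode(), formatJournalStorage(), restartJournalNode()
// and waitForEditLogsSynced() are hypothetical helpers, not existing APIs.
@Test(timeout = 60000)
public void testSyncAfterJNDowntimeAndFormat() throws Exception {
  doAnEdit();                  // roll some edits so all JNs have finalized segments

  stopJournalNode(0);          // one JN goes down
  formatJournalStorage(0);     // its storage is wiped/re-formatted while down

  doAnEdit();                  // NN keeps writing edits to the remaining JNs

  restartJournalNode(0);       // JN comes back with an empty journal
  waitForEditLogsSynced(0);    // the JN syncer should copy the missing segments
}
{code}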



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-12223) Rebasing HDFS-10467

2017-07-28 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri resolved HDFS-12223.

Resolution: Fixed
  Assignee: Inigo Goiri

> Rebasing HDFS-10467
> ---
>
> Key: HDFS-12223
> URL: https://issues.apache.org/jira/browse/HDFS-12223
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Inigo Goiri
>Assignee: Inigo Goiri
> Fix For: HDFS-10467
>
> Attachments: HDFS-12223-HDFS-10467.patch
>
>
> Erasure Coding methods were added to {{ClientProtocol}}. Adding those 
> methods to {{RouterRpcServer}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12223) Rebasing HDFS-10467

2017-07-28 Thread Inigo Goiri (JIRA)
Inigo Goiri created HDFS-12223:
--

 Summary: Rebasing HDFS-10467
 Key: HDFS-12223
 URL: https://issues.apache.org/jira/browse/HDFS-12223
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Inigo Goiri


Erasure Coding methods were added to {{ClientProtocol}}. Adding those methods 
to {{RouterRpcServer}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12222) Add EC information to BlockLocations

2017-07-28 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12222:
--

 Summary: Add EC information to BlockLocations
 Key: HDFS-12222
 URL: https://issues.apache.org/jira/browse/HDFS-12222
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0-alpha1
Reporter: Andrew Wang


HDFS applications query block location information to compute splits. One 
example of this is FileInputFormat:

https://github.com/apache/hadoop/blob/d4015f8628dd973c7433639451a9acc3e741d2a2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L346

You see bits of code that calculate offsets as follows:

{noformat}
long bytesInThisBlock = blkLocations[startIndex].getOffset() + 
  blkLocations[startIndex].getLength() - offset;
{noformat}

EC confuses this since the block locations include parity block locations as 
well, which are not part of the logical file length. This messes up the offset 
calculation and thus topology/caching information too.

Applications can figure out what's a parity block by reading the EC policy and 
then parsing the schema, but it'd be a lot better if we exposed this more 
generically in BlockLocation instead.
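
For illustration, a minimal sketch of what the split-size calculation could 
look like if a per-location parity flag were exposed; the {{isParity}} array 
below is hypothetical and only stands in for whatever BlockLocation ends up 
exposing:

{code:java}
// Illustration only: isParity[] stands in for a (hypothetical) parity flag.
static long bytesInBlockContaining(long offset, long[] blkOffsets,
    long[] blkLengths, boolean[] isParity) {
  for (int i = 0; i < blkOffsets.length; i++) {
    if (isParity[i]) {
      continue;  // parity locations are not part of the logical file length
    }
    if (offset >= blkOffsets[i] && offset < blkOffsets[i] + blkLengths[i]) {
      // same arithmetic as FileInputFormat, restricted to data blocks
      return blkOffsets[i] + blkLengths[i] - offset;
    }
  }
  throw new IllegalArgumentException("offset " + offset
      + " is not covered by any data block");
}
{code}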



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12221) Replace xerces in XmlEditsVisitor

2017-07-28 Thread Lei (Eddy) Xu (JIRA)
Lei (Eddy) Xu created HDFS-12221:


 Summary: Replace xerces in XmlEditsVisitor 
 Key: HDFS-12221
 URL: https://issues.apache.org/jira/browse/HDFS-12221
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0-alpha4
Reporter: Lei (Eddy) Xu


XmlEditsVisitor should use the XML capabilities available in newer JDKs, to 
make JAR shading easier (HADOOP-14672).
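
For reference, a minimal sketch (not the actual patch) of emitting XML through 
the JAXP APIs that ship with the JDK, which avoids the external Xerces 
serializer; the element names are only illustrative:

{code:java}
import java.io.OutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Writes a tiny <EDITS> document using only JDK-provided classes.
public class JdkXmlWriterSketch {
  public static void write(OutputStream out) throws Exception {
    SAXTransformerFactory factory =
        (SAXTransformerFactory) SAXTransformerFactory.newInstance();
    TransformerHandler handler = factory.newTransformerHandler();
    handler.getTransformer().setOutputProperty(OutputKeys.INDENT, "yes");
    handler.setResult(new StreamResult(out));

    AttributesImpl empty = new AttributesImpl();
    handler.startDocument();
    handler.startElement("", "", "EDITS", empty);
    handler.startElement("", "", "EDITS_VERSION", empty);
    char[] version = "-64".toCharArray();
    handler.characters(version, 0, version.length);
    handler.endElement("", "", "EDITS_VERSION");
    handler.endElement("", "", "EDITS");
    handler.endDocument();
  }
}
{code}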



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12220) hdfs' parallel tests don't work for Windows

2017-07-28 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HDFS-12220:
---

 Summary: hdfs' parallel tests don't work for Windows
 Key: HDFS-12220
 URL: https://issues.apache.org/jira/browse/HDFS-12220
 Project: Hadoop HDFS
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0-beta1
 Environment: Windows
Reporter: Allen Wittenauer


create-parallel-tests-dirs in hadoop-hdfs-project/hadoop-hdfs/pom.xml fails with:

{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-antrun-plugin:1.7:run 
(create-parallel-tests-dirs) on project hadoop-hdfs: An Ant BuildException has 
occured: Directory 
F:\jenkins\jenkins-slave\workspace\hadoop-trunk-win\s\hadoop-hdfs-project\hadoop-hdfs\jenkinsjenkins-slaveworkspacehadoop-trunk-winshadoop-hdfs-projecthadoop-hdfs
arget\test\data\1 creation was not successful for an unknown reason
[ERROR] around Ant part 

[jira] [Created] (HDFS-12219) Javadoc for FSNamesystem#getMaxObjects is incorrect

2017-07-28 Thread Erik Krogen (JIRA)
Erik Krogen created HDFS-12219:
--

 Summary: Javadoc for FSNamesystem#getMaxObjects is incorrect
 Key: HDFS-12219
 URL: https://issues.apache.org/jira/browse/HDFS-12219
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Erik Krogen
Assignee: Erik Krogen
Priority: Trivial


The Javadoc states that this represents the total number of objects in the 
system, but it really represents the maximum allowed number of objects (as 
correctly stated in the Javadoc for {{FSNamesystemMBean#getMaxObjects()}}).
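
Illustrative only, a suggested wording for the fix (the field name and exact 
phrasing are assumptions), aligned with FSNamesystemMBean:

{code:java}
/**
 * Maximum number of objects (files + directories + blocks) allowed in the
 * namespace, or zero if no limit is configured -- not the current count.
 */
@Override // FSNamesystemMBean
public long getMaxObjects() {
  return maxFsObjects;   // field name assumed for illustration
}
{code}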



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Merge Storage Policy Satisfier (SPS) [HDFS-10285] feature branch to trunk

2017-07-28 Thread Andrew Wang
Hi Uma,

> If there are still plans to make changes that affect compatibility (the
> hybrid RPC and bulk DN work mentioned sound like they would), then we can
> cut branch-3 first, or wait to merge until after these tasks are finished.
> [Uma] We don't see those 2 items as high priority for the feature. Users
> would be able to use the feature with the current code base and API. So, we
> would consider them only after branch-3. That should be perfectly fine IMO.
> The current API is very useful for the HBase scenario. In the HBase case,
> they will rename files into a directory with a different policy. They will
> not always set the policies themselves. So, when they rename files into a
> directory with a different policy, they can simply call satisfyStoragePolicy;
> they don't need any hybrid API.
>

Great to hear. It'd be nice to define which use cases are met by the current
version of SPS, and which will be handled after the merge.

A bit more detail in the design doc on how HBase would use this feature
would also be helpful. Is there an HBase JIRA already?

I also spent some more time with the design doc and posted a few questions
on the JIRA.

Best,
Andrew


[jira] [Created] (HDFS-12218) Rename split EC / replicated block metrics in BlockManager

2017-07-28 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-12218:
--

 Summary: Rename split EC / replicated block metrics in BlockManager
 Key: HDFS-12218
 URL: https://issues.apache.org/jira/browse/HDFS-12218
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding, metrics
Affects Versions: 3.0.0-beta1
Reporter: Andrew Wang


As noted in HDFS-12206, we should propagate the naming changes made there for 
FSNamesystem into BlockManager and related classes. This is also an opportunity 
to clarify the usage of "ECBlocks" vs "ECBlockGroups" in some names.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12217) HDFS snapshots don't capture all open files when one of the open files is deleted

2017-07-28 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-12217:
-

 Summary: HDFS snapshots don't capture all open files when one of 
the open files is deleted
 Key: HDFS-12217
 URL: https://issues.apache.org/jira/browse/HDFS-12217
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0-alpha1
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy


With the fix for HDFS-11402, HDFS Snapshots can additionally capture all the 
open files. Just like all other files, these open files in the snapshots will 
remain immutable. But, sometimes it is found that snapshots fail to capture all 
the open files in the system.

Under the following conditions, LeaseManager will fail to find the INode 
corresponding to an active lease:
-- the file is opened for writing (LeaseManager allots a lease)
-- the file is deleted while it is still open for writing and has an active lease
-- the file is not referenced in any other Snapshots/Trash

{{INode[] LeaseManager#getINodesWithLease()}} can thus return null for a few 
leases, thereby causing the caller to trip over and not return all the open 
files needed by the snapshot manager.
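
A sketch of the defensive handling the caller needs (not the actual 
LeaseManager code; the collection and lookup names below are illustrative):

{code:java}
// Leases whose file was deleted (and is not referenced from any snapshot or
// Trash) resolve to a null inode; such entries must be skipped rather than
// propagated to the snapshot manager.
List<INode> inodesWithLease = new ArrayList<>();
for (long inodeId : inodeIdsUnderLease) {          // illustrative collection
  INode inode = fsDirectory.getInode(inodeId);     // may return null after a delete
  if (inode == null) {
    continue;                                      // stale lease, nothing to snapshot
  }
  inodesWithLease.add(inode);
}
{code}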





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12216) Ozone: TestKeys and TestKeysRatis are failing consistently

2017-07-28 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDFS-12216:


 Summary: Ozone: TestKeys and TestKeysRatis are failing consistently
 Key: HDFS-12216
 URL: https://issues.apache.org/jira/browse/HDFS-12216
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Affects Versions: HDFS-7240
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh
 Fix For: HDFS-7240


TestKeys and TestKeysRatis are failing consistently, as noted in the test logs 
for HDFS-12183.

TestKeysRatis is failing because of the following error
{code}
2017-07-28 23:11:28,783 [StateMachineUpdater-127.0.0.1:55793] ERROR 
impl.StateMachineUpdater (ExitUtils.java:terminate(80)) - Terminating with exit 
status 2: StateMachineUpdater-127.0.0.1:55793: the StateMachineUpdater hits 
Throwable
org.iq80.leveldb.DBException: Closed
at org.fusesource.leveldbjni.internal.JniDB.put(JniDB.java:123)
at org.apache.hadoop.utils.LevelDBStore.put(LevelDBStore.java:98)
at 
org.apache.hadoop.ozone.container.common.impl.KeyManagerImpl.putKey(KeyManagerImpl.java:90)
at 
org.apache.hadoop.ozone.container.common.impl.Dispatcher.handlePutKey(Dispatcher.java:547)
at 
org.apache.hadoop.ozone.container.common.impl.Dispatcher.keyProcessHandler(Dispatcher.java:206)
at 
org.apache.hadoop.ozone.container.common.impl.Dispatcher.dispatch(Dispatcher.java:110)
at 
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatch(ContainerStateMachine.java:94)
at 
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.applyTransaction(ContainerStateMachine.java:81)
at 
org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:913)
at 
org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:142)
at java.lang.Thread.run(Thread.java:745)
{code}

whereas TestKeys is failing because of:
{code}
2017-07-28 23:14:20,889 [Thread-486] INFO  scm.XceiverClientManager 
(XceiverClientManager.java:getClient(158)) - exception 
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection 
refused: /127.0.0.1:55914
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12215) DataNode#transferBlock does not create its daemon in the xceiver thread group

2017-07-28 Thread Lei (Eddy) Xu (JIRA)
Lei (Eddy) Xu created HDFS-12215:


 Summary: DataNode#transferBlock does not create its daemon in the 
xceiver thread group
 Key: HDFS-12215
 URL: https://issues.apache.org/jira/browse/HDFS-12215
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0-alpha4
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu


As mentioned in HDFS-12044, the DataNode#transferBlock daemon is not counted 
toward the xceiver count.
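
A minimal sketch of the direction implied by the summary (not the committed 
change; {{xceiverThreadGroup}} and {{transferTask}} are placeholders for the 
DataXceiverServer's ThreadGroup and the existing DataTransfer runnable): create 
the transfer daemon inside the xceiver ThreadGroup so it is reflected in 
getXceiverCount().

{code:java}
// Sketch only: start the transfer thread in the same ThreadGroup that the
// DataXceiver threads use, so it is counted as an xceiver.
Daemon daemon = new Daemon(xceiverThreadGroup, transferTask);
daemon.start();
{code}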





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-07-28 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/477/

[Jul 27, 2017 5:44:50 PM] (varunsaxena) YARN-5548. Use MockRMMemoryStateStore 
to reduce test failures (Bibin A
[Jul 27, 2017 6:40:45 PM] (varunsaxena) Addendum for YARN-5548. Use 
MockRMMemoryStateStore to reduce test
[Jul 27, 2017 7:02:57 PM] (shv) HDFS-11896. Non-dfsUsed will be doubled on dead 
node re-registration.
[Jul 27, 2017 8:04:50 PM] (aw) HADOOP-14692. Upgrade Apache Rat
[Jul 27, 2017 8:23:15 PM] (jitendra) HDFS-2319. Add test cases for FSshell 
-stat. Contributed by XieXianshan
[Jul 27, 2017 11:48:24 PM] (yzhang) HDFS-12190. Enable 'hdfs dfs -stat' to 
display access time. Contributed
[Jul 28, 2017 12:10:52 AM] (aajisaka) HADOOP-11875. [JDK9] Adding a second copy 
of Hamlet without _ as a
[Jul 28, 2017 6:19:39 AM] (yufei) YARN-6864. FSPreemptionThread cleanup for 
readability. (Daniel Templeton




-1 overall


The following subsystems voted -1:
findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs-client 
   Possible exposure of partially initialized object in 
org.apache.hadoop.hdfs.DFSClient.initThreadsNumForStripedReads(int) At 
DFSClient.java:object in 
org.apache.hadoop.hdfs.DFSClient.initThreadsNumForStripedReads(int) At 
DFSClient.java:[line 2888] 
   org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.equals(Object) 
makes inefficient use of keySet iterator instead of entrySet iterator At 
SlowDiskReports.java:keySet iterator instead of entrySet iterator At 
SlowDiskReports.java:[line 105] 

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus() due to 
return value of called method Dereferenced at 
JournalNode.java:org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus()
 due to return value of called method Dereferenced at JournalNode.java:[line 
302] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setClusterId(String)
 unconditionally sets the field clusterId At HdfsServerConstants.java:clusterId 
At HdfsServerConstants.java:[line 193] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForce(int)
 unconditionally sets the field force At HdfsServerConstants.java:force At 
HdfsServerConstants.java:[line 217] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForceFormat(boolean)
 unconditionally sets the field isForceFormat At 
HdfsServerConstants.java:isForceFormat At HdfsServerConstants.java:[line 229] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setInteractiveFormat(boolean)
 unconditionally sets the field isInteractiveFormat At 
HdfsServerConstants.java:isInteractiveFormat At HdfsServerConstants.java:[line 
237] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File, File, 
int, HardLink, boolean, File, List) due to return value of called method 
Dereferenced at 
DataStorage.java:org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File,
 File, int, HardLink, boolean, File, List) due to return value of called method 
Dereferenced at DataStorage.java:[line 1339] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method Dereferenced at 
NNStorageRetentionManager.java:org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method Dereferenced at 
NNStorageRetentionManager.java:[line 258] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNUpgradeUtil$1.visitFile(Path, 
BasicFileAttributes) due to return value of called method Dereferenced at 
NNUpgradeUtil.java:org.apache.hadoop.hdfs.server.namenode.NNUpgradeUtil$1.visitFile(Path,
 BasicFileAttributes) due to return value of called method Dereferenced at 
NNUpgradeUtil.java:[line 133] 
   Useless condition:argv.length >= 1 at this point At DFSAdmin.java:[line 
2100] 
   Useless condition:numBlocks == -1 at this point At 
ImageLoaderCurrent.java:[line 727] 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
   Useless object stored in variable removedNullContainers of method 
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.removeOrTrackCompletedContainersFromContext(List)
 At NodeStatusUpdaterImpl.java:removedNullContainers of method 

[jira] [Created] (HDFS-12214) Rename configuration property 'dfs.storage.policy.satisfier.activate' to 'dfs.storage.policy.satisfier.enable'

2017-07-28 Thread Rakesh R (JIRA)
Rakesh R created HDFS-12214:
---

 Summary: Rename configuration property 
'dfs.storage.policy.satisfier.activate' to 'dfs.storage.policy.satisfier.enable'
 Key: HDFS-12214
 URL: https://issues.apache.org/jira/browse/HDFS-12214
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R


This sub-task is to address [~andrew.wang]'s review comments. Please refer to 
the [review 
comment|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16103734&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16103734]
 in the HDFS-10285 umbrella JIRA.
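
One possible way to keep existing configs working across the rename (an 
assumption, not necessarily part of this patch) is Hadoop's standard 
key-deprecation mechanism:

{code:java}
// Map the old key to the new one so configs that still set
// 'dfs.storage.policy.satisfier.activate' resolve to the renamed property.
Configuration.addDeprecation(
    "dfs.storage.policy.satisfier.activate",
    "dfs.storage.policy.satisfier.enable");

Configuration conf = new HdfsConfiguration();
boolean spsEnabled =
    conf.getBoolean("dfs.storage.policy.satisfier.enable", false);
{code}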



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12213) Ozone: Corona: Support for online mode

2017-07-28 Thread Nandakumar (JIRA)
Nandakumar created HDFS-12213:
-

 Summary: Ozone: Corona: Support for online mode
 Key: HDFS-12213
 URL: https://issues.apache.org/jira/browse/HDFS-12213
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Reporter: Nandakumar
Assignee: Nandakumar


This JIRA adds support for online mode in Corona.
In online mode, Common Crawl data from AWS will be used to populate Ozone with 
data. The default source is [CC-MAIN-2017-17/warc.paths.gz | 
https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2017-17/warc.paths.gz] 
(it contains the paths to the actual data segments); the user can override this 
using -source.
The following values are derived from the URL of the Common Crawl data (a rough 
sketch of this mapping follows below):
* Domain will be used as Volume
* URL will be used as Bucket
* FileName will be used as Key
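
A rough sketch of that mapping (not the actual Corona code; the example URL and 
the exact naming rules are illustrative only):

{code:java}
import java.net.URI;

public class CommonCrawlMappingSketch {
  public static void main(String[] args) {
    URI uri = URI.create("https://commoncrawl.s3.amazonaws.com/"
        + "crawl-data/CC-MAIN-2017-17/segments/example/warc/example-00000.warc.gz");

    String path = uri.getPath();
    String volume = uri.getHost();                            // domain -> volume
    String bucket = path.substring(1, path.lastIndexOf('/'))
        .replace('/', '-');                                   // URL path -> bucket
    String key = path.substring(path.lastIndexOf('/') + 1);   // file name -> key

    System.out.println(volume + " / " + bucket + " / " + key);
  }
}
{code}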



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org




[jira] [Created] (HDFS-12212) Options.Rename.TO_TRASH is considered even when Options.Rename.NONE is specified

2017-07-28 Thread Vinayakumar B (JIRA)
Vinayakumar B created HDFS-12212:


 Summary: Options.Rename.TO_TRASH is considered even when 
Options.Rename.NONE is specified
 Key: HDFS-12212
 URL: https://issues.apache.org/jira/browse/HDFS-12212
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Vinayakumar B


HDFS-8312 introduced {{Options.Rename.TO_TRASH}} to differentiate movement to 
trash from other renames for permission checks.

However, even when Options.Rename.NONE is passed, TO_TRASH is considered for 
the rename, and the wrong permissions are checked.
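
A sketch of the kind of guard the rename path needs (not the actual 
FSDirRenameOp code): take the TO_TRASH permission path only when TO_TRASH is 
actually present in the passed options.

{code:java}
import org.apache.hadoop.fs.Options.Rename;

static boolean isMoveToTrash(Rename... options) {
  if (options == null) {
    return false;
  }
  for (Rename option : options) {
    if (option == Rename.TO_TRASH) {
      return true;    // only then apply the trash-specific permission check
    }
  }
  return false;       // Rename.NONE (or OVERWRITE) keeps the normal check
}
{code}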




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12211) Block Storage: Add start-cblock.sh to quickly start cblock.

2017-07-28 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDFS-12211:


 Summary: Block Storage:  Add start-cblock.sh to quickly start 
cblock.
 Key: HDFS-12211
 URL: https://issues.apache.org/jira/browse/HDFS-12211
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Affects Versions: HDFS-7240
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh
 Fix For: HDFS-7240


Add start-cblock.sh to quickly start cblock.

Currently, to start CBlock, the SCM, the CBlock manager and the jSCSI server 
need to be started separately.
This JIRA will add a new script to automate this process.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-12210) Block Storage: volume creation times out while creating 3TB volume because of too many containers

2017-07-28 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDFS-12210:


 Summary: Block Storage: volume creation times out while creating 
3TB volume because of too many containers
 Key: HDFS-12210
 URL: https://issues.apache.org/jira/browse/HDFS-12210
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Affects Versions: HDFS-7240
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh
 Fix For: HDFS-7240


Volume creation times out while creating 3TB volume because of too many 
containers

{code}
[hdfs@ctr-e134-1499953498516-64773-01-03 ~]$ 
/opt/hadoop/hadoop-3.0.0-beta1-SNAPSHOT/bin/hdfs cblock -c bilbo disk1 3TB 4
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/hadoop/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/opt/hadoop/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/hdfs/lib/logback-classic-1.0.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/07/28 09:32:40 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
17/07/28 09:32:40 INFO cli.CBlockCli: create volume:[bilbo, disk1, 3TB, 4]
17/07/28 09:33:10 ERROR cli.CBlockCli: java.net.SocketTimeoutException: Call 
From ctr-e134-1499953498516-64773-01-03.hwx.site/172.27.51.64 to 
0.0.0.0:9810 failed on socket timeout exception: 
java.net.SocketTimeoutException: 3 millis timeout while waiting for channel 
to be ready for read. ch : java.nio.channels.SocketChannel[connected 
local=/172.27.51.64:59317 remote=/0.0.0.0:9810]; For more details see:  
http://wiki.apache.org/hadoop/SocketTimeout
{code}

Looking into the logs, it can be seen that 614 containers were created for the 
volume (3 TB at what appears to be the default 5 GB container size works out to 
roughly 614 containers), and container creation was still running long after 
the client call had already timed out.
{code}
2017-07-28 09:32:40,853 INFO org.apache.hadoop.cblock.CBlockManager: Create 
volume received: userName: bilbo volumeName: disk1 volumeSize: 3298534883328 
blockSize: 4096
2017-07-28 09:32:42,545 INFO 
org.apache.hadoop.scm.client.ContainerOperationClient: Created container 
bilbo:disk1#0 leader:172.27.50.192:9866 machines:[172.27.50.192:9866] 
replication factor:1
2017-07-28 09:32:43,213 INFO 
org.apache.hadoop.scm.client.ContainerOperationClient: Created container 
bilbo:disk1#1 leader:172.27.51.65:9866 machines:[172.27.51.65:9866] replication 
factor:1
2017-07-28 09:32:43,484 INFO 
org.apache.hadoop.scm.client.ContainerOperationClient: Created container 
bilbo:disk1#2 leader:172.27.50.192:9866 machines:[172.27.50.192:9866] 
replication factor:1
.
.
.
.
2017-07-28 09:35:01,712 INFO 
org.apache.hadoop.scm.client.ContainerOperationClient: Created container 
bilbo:disk1#612 leader:172.27.50.128:9866 machines:[172.27.50.128:9866] 
replication factor:1
2017-07-28 09:35:01,963 INFO 
org.apache.hadoop.scm.client.ContainerOperationClient: Created container 
bilbo:disk1#613 leader:172.27.50.128:9866 machines:[172.27.50.128:9866] 
replication factor:1
2017-07-28 09:35:02,256 INFO 
org.apache.hadoop.scm.client.ContainerOperationClient: Created container 
bilbo:disk1#614 leader:172.27.50.192:9866 machines:[172.27.50.192:9866] 
replication factor:1
2017-07-28 09:35:02,358 INFO org.apache.hadoop.cblock.CBlockManager: Create 
volume received: userName: bilbo volumeName: disk2 volumeSize: 1099511627776 
blockSize: 4096
2017-07-28 09:35:02,368 WARN org.apache.hadoop.ipc.Server: IPC Server handler 0 
on 9810, call Call#0 Retry#0 
org.apache.hadoop.cblock.protocolPB.CBlockServiceProtocol.createVolume from 
172.27.51.64:59
317: output error
2017-07-28 09:35:02,369 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 
on 9810 caught an exception
java.nio.channels.ClosedChannelException
at 
sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:270)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:461)
at org.apache.hadoop.ipc.Server.channelWrite(Server.java:3242)
at org.apache.hadoop.ipc.Server.access$1700(Server.java:137)
at 
org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:1466)
at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1536)
at 
org.apache.hadoop.ipc.Server$Connection.sendResponse(Server.java:2586)
at org.apache.hadoop.ipc.Server$Connection.access$300(Server.java:1608)
at org.apache.hadoop.ipc.Server$RpcCall.doResponse(Server.java:933)
at org.apache.hadoop.ipc.Server$Call.sendResponse(Server.java:767)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:878)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)

[jira] [Created] (HDFS-12209) VolumeScanner scan cursor not saved periodically

2017-07-28 Thread fatkun (JIRA)
fatkun created HDFS-12209:
-

 Summary: VolumeScanner scan cursor not saved periodically
 Key: HDFS-12209
 URL: https://issues.apache.org/jira/browse/HDFS-12209
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 2.6.0
 Environment: cdh5.4.0
Reporter: fatkun


The bug was introduced by HDFS-7430: the two timestamps are not on the same 
clock; one is monotonic time (monotonicMs) and the other is wall-clock time.

VolumeScanner.java
{code:java}
long saveDelta = monotonicMs - curBlockIter.getLastSavedMs();
if (saveDelta >= conf.cursorSaveMs) {
  LOG.debug("{}: saving block iterator {} after {} ms.",
  this, curBlockIter, saveDelta);
  saveBlockIterator(curBlockIter);
}
{code}

curBlockIter.getLastSavedMs() is initialized here:

FsVolumeImpl.java
{code:java}
BlockIteratorState() {
  lastSavedMs = iterStartMs = Time.now();
  curFinalizedDir = null;
  curFinalizedSubDir = null;
  curEntry = null;
  atEnd = false;
}
{code}
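
One possible fix, as a sketch (not a committed patch): initialize the cursor 
timestamps on the same monotonic clock that VolumeScanner compares against.

{code:java}
BlockIteratorState() {
  // use the monotonic clock so saveDelta = monotonicMs - lastSavedMs is meaningful
  lastSavedMs = iterStartMs = Time.monotonicNow();
  curFinalizedDir = null;
  curFinalizedSubDir = null;
  curEntry = null;
  atEnd = false;
}
{code}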





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Merge Storage Policy Satisfier (SPS) [HDFS-10285] feature branch to trunk

2017-07-28 Thread Gangumalla, Uma
Hi Andrew, Thanks a lot for reviewing.

Your understanding of the 2 factors is totally right. More than 90% of the code 
was newly added, and only a small portion of the existing code was touched, 
namely the NN RPCs and DN messages. We can see that in the combined patch stats 
(only 45 lines with "-").

> If there are still plans to make changes that affect compatibility (the 
> hybrid RPC and bulk DN work mentioned sound like they would), then we can cut 
> branch-3 first, or wait to merge until after these tasks are finished.
[Uma] We don't see those 2 items as high priority for the feature. Users would 
be able to use the feature with the current code base and API. So, we would 
consider them only after branch-3. That should be perfectly fine IMO. The 
current API is very useful for the HBase scenario. In the HBase case, they will 
rename files into a directory with a different policy. They will not always set 
the policies themselves. So, when they rename files into a directory with a 
different policy, they can simply call satisfyStoragePolicy; they don't need 
any hybrid API.

>* Possible impact when this feature is disabled
[Uma] Related to this point, I wanted to highlight the dynamic activation and 
deactivation of the feature. That means the feature can be disabled/enabled 
without restarting the Namenode.
If the feature is disabled, there should be zero impact. Since enabling is 
dynamic, we will not even initialize the threads when the feature is disabled; 
the service is initialized only when it is enabled. For an easy review, please 
look at the last section in this documentation: 
ArchivalStorage.html


Also, the tiered storage + HDFS mounts solution 
(https://issues.apache.org/jira/browse/HDFS-12090) wants to use the SPS 
feature, so having SPS upstream would allow the dependent HDFS-12090 feature to 
proceed. (I don't say we have to merge because of this reason alone, but I 
would just like to mention it as an endorsement of the feature. :-) )

Regards,
Uma

From: Andrew Wang
Date: Thursday, July 27, 2017 at 12:15 PM
To: Uma Gangumalla
Cc: "hdfs-dev@hadoop.apache.org"
Subject: Re: [DISCUSS] Merge Storage Policy Satisfier (SPS) [HDFS-10285] 
feature branch to trunk

Hi Uma, Rakesh,

First off, I like the idea of this feature. It'll definitely make HSM easier to 
use.

With my RM hat on, I gave the patch a quick skim looking for:

* Possible impact when this feature is disabled
* API stability and other compat concerns

At a high-level, it looks like it uses xattrs rather than new edit log ops to 
track files being moved. Some new NN RPCs and DN messages added to interact 
with the feature. Almost entirely new code that doesn't modify the guts of HDFS 
much.

Could you comment further on these two concerns? We're closing in on 
3.0.0-beta1, so the merge of any large amount of new code makes me wary. If 
there are still plans to make changes that affect compatibility (the hybrid RPC 
and bulk DN work mentioned sound like they would), then we can cut branch-3 
first, or wait to merge until after these tasks are finished.

Best,
Andrew



On Mon, Jul 24, 2017 at 11:35 PM, Gangumalla, Uma wrote:
Dear All,

I would like to propose merging the Storage Policy Satisfier (SPS) feature into 
trunk. We have been working on this feature for the last several months. The 
feature has received contributions from different companies, and all of the 
feature development happened smoothly and collaboratively in JIRAs.

Detailed design document is available in JIRA: 
Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf
Test report attached to JIRA: 
HDFS-SPS-TestReport-20170708.pdf

Short Description of the feature:-
   The Storage Policy Satisfier feature aims to let distributed HDFS 
applications schedule block movements easily.
   When a storage policy change happens, the user can invoke the 
satisfyStoragePolicy API to trigger the block storage movements (see the usage 
sketch after this list).
   Block movement tasks will be assigned to datanodes, and the movements will 
happen in a distributed fashion.
   Block-level movement tracking has also been distributed to DNs to avoid load 
on the Namenode.
   A co-ordinator Datanode tracks all the blocks associated with a 
blockCollection and sends the consolidated final results to the Namenode.
   If a movement result is a failure, the Namenode will re-schedule the block 
movements.
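
A hypothetical usage sketch based on the description above (the exact class 
exposing satisfyStoragePolicy is an assumption on my part):

    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
    Path file = new Path("/archive/logs/app.log");
    dfs.setStoragePolicy(file, "COLD");   // existing HSM API: change the policy
    dfs.satisfyStoragePolicy(file);       // new SPS API: schedule block movements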

Development branch is: HDFS-10285
No of JIRAs Resolved: 38
Pending JIRAs: 4 (I don’t think