[jira] [Created] (HADOOP-9629) Support Azure Blob Storage as a file system in Hadoop
Mostafa Elhemali created HADOOP-9629:

Summary: Support Azure Blob Storage as a file system in Hadoop
Key: HADOOP-9629
URL: https://issues.apache.org/jira/browse/HADOOP-9629
Project: Hadoop Common
Issue Type: Improvement
Reporter: Mostafa Elhemali
Assignee: Mostafa Elhemali

h2. Description

This JIRA adds a new file system implementation for accessing Windows Azure Blob storage from within Hadoop, for example using blobs as input to MR jobs or configuring MR jobs to write their output directly into blob storage.

h2. High level design

At a high level, the code here extends the FileSystem class to provide an implementation for accessing blob storage; the scheme asv is used for access over HTTP, and asvs for access over HTTPS. We use the URI scheme:
{code}asv[s]://container@account/path/to/file{code}
to address individual blobs. We use the standard Azure Java SDK (com.microsoft.windowsazure) to do most of the work. To map a hierarchical file system onto the flat name-value nature of blob storage, we create a specially tagged blob named path/to/dir whenever we create a directory called path/to/dir; files under it are then stored as normal blobs named path/to/dir/file.

We have implemented many metrics for it using the Metrics2 interface. Tests are implemented mostly against a mock implementation of the Azure SDK functionality, with an option to test against real blob storage if configured (instructions are provided in RunningLiveAsvTests.txt).

h2. Credits and history

This has been ongoing work for a while, and the early version of this work can be seen in HADOOP-8079. This JIRA is a significant revision of that, and we'll post the patch here for Hadoop trunk first, then post a patch for branch-1 as well to backport the functionality if accepted.
Credit for this work goes to the early team: Min Wei, David Lao, Lengning Liu and Alexander Stojanovic as well as multiple people who have taken over this work since then (hope I don't forget anyone): Dexter Bradshaw, Johannes Klein, Ivan Mitic, Michael Rys and Mostafa Elhemali. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
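To make the addressing scheme in the HADOOP-9629 description concrete, here is a minimal sketch of how an asv[s]://container@account/path/to/file URI could be split into its parts with java.net.URI. It is not taken from the patch: the class AsvUriSketch, its fields, the parentKey helper and the sample names "data" and "myaccount" are all hypothetical, and the real implementation may parse URIs differently.

```java
import java.net.URI;

/** Hypothetical helper for illustration; not taken from the HADOOP-9629 patch. */
final class AsvUriSketch {
    final boolean secure;    // true for asvs (HTTPS), false for asv (HTTP)
    final String container;  // the part before '@'
    final String account;    // the part after '@'
    final String blobKey;    // flat blob name, e.g. "path/to/file"

    private AsvUriSketch(boolean secure, String container, String account, String blobKey) {
        this.secure = secure;
        this.container = container;
        this.account = account;
        this.blobKey = blobKey;
    }

    static AsvUriSketch parse(String text) {
        URI uri = URI.create(text);
        String scheme = uri.getScheme();
        if (!"asv".equals(scheme) && !"asvs".equals(scheme)) {
            throw new IllegalArgumentException("Not an asv[s] URI: " + text);
        }
        // java.net.URI exposes "container@account" as user-info and host.
        String container = uri.getUserInfo();
        String account = uri.getHost();
        if (container == null || account == null) {
            throw new IllegalArgumentException("Expected container@account in: " + text);
        }
        // Blob names are flat, so drop the leading slash of the URI path.
        String path = uri.getPath();
        String key = path.startsWith("/") ? path.substring(1) : path;
        return new AsvUriSketch("asvs".equals(scheme), container, account, key);
    }

    /** Key of the directory marker blob that would sit directly above blobKey. */
    static String parentKey(String blobKey) {
        int slash = blobKey.lastIndexOf('/');
        return slash < 0 ? "" : blobKey.substring(0, slash);
    }

    public static void main(String[] args) {
        AsvUriSketch u = parse("asvs://data@myaccount/path/to/dir/file");
        System.out.println(u.container + " " + u.account + " " + u.blobKey);
        System.out.println(parentKey(u.blobKey));
    }
}
```

Under the directory convention described in the JIRA, the blob path/to/dir/file would be expected to have a tagged marker blob at parentKey, that is path/to/dir.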
[jira] [Reopened] (HADOOP-9526) TestShellCommandFencer and TestShell fail on Windows
[ https://issues.apache.org/jira/browse/HADOOP-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chuan Liu reopened HADOOP-9526:

TestShellCommandFencer will fail if there is a machine named ‘host’ in the network. The %target_address% environment variable used in the test comes from the result of InetSocketAddress.getAddress(). The method returns host/ip when the host actually exists in the network. When the test compares the log output, it assumes there is no IP in the address.

TestShellCommandFencer and TestShell fail on Windows

Key: HADOOP-9526
URL: https://issues.apache.org/jira/browse/HADOOP-9526
Project: Hadoop Common
Issue Type: Bug
Components: test
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Fix For: 3.0.0, 2.1.0-beta
Attachments: HADOOP-9526.001.patch, HADOOP-9526.002.patch

The following TestShellCommandFencer tests fail on Windows.
# testTargetAsEnvironment
# testConfAsEnvironment
# testTargetAsEnvironment

TestShell#testInterval also fails. All failures look like test issues.
[jira] [Created] (HADOOP-9630) Remove IpcSerializationType
Luke Lu created HADOOP-9630:

Summary: Remove IpcSerializationType
Key: HADOOP-9630
URL: https://issues.apache.org/jira/browse/HADOOP-9630
Project: Hadoop Common
Issue Type: Sub-task
Reporter: Luke Lu

IpcSerializationType is assumed to be protobuf for the foreseeable future. It is not to be confused with RpcKind, which still supports different RpcEngines. Let's remove the dead code, which can be confusing to maintain.
[jira] [Created] (HADOOP-9631) ViewFs should use underlying FileSystem's server side defaults
Lohit Vijayarenu created HADOOP-9631:

Summary: ViewFs should use underlying FileSystem's server side defaults
Key: HADOOP-9631
URL: https://issues.apache.org/jira/browse/HADOOP-9631
Project: Hadoop Common
Issue Type: Bug
Components: fs, viewfs
Affects Versions: 2.0.4-alpha
Reporter: Lohit Vijayarenu

On a cluster with ViewFS as the default FileSystem, creating files using FileContext always results in a replication factor of 1 instead of the default of the underlying filesystem (such as HDFS).
[jira] [Resolved] (HADOOP-9628) Setup a daily build job for branch-2.1.0-beta
[ https://issues.apache.org/jira/browse/HADOOP-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giridharan Kesavan resolved HADOOP-9628.

Resolution: Fixed

Jenkins job configured to run daily: https://builds.apache.org/job/Hadoop-branch-2.1-beta/

Setup a daily build job for branch-2.1.0-beta

Key: HADOOP-9628
URL: https://issues.apache.org/jira/browse/HADOOP-9628
Project: Hadoop Common
Issue Type: Bug
Components: build
Affects Versions: 2.1.0-beta
Reporter: Hitesh Shah
Assignee: Giridharan Kesavan
[jira] [Resolved] (HADOOP-9615) Hadoop Jar command not working when used with Spring ORM
[ https://issues.apache.org/jira/browse/HADOOP-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas resolved HADOOP-9615.

Resolution: Invalid

I agree that this is not a Hadoop issue. I am going to close this as Invalid. Please re-open if you disagree, with reasons why this is a Hadoop issue.

Hadoop Jar command not working when used with Spring ORM

Key: HADOOP-9615
URL: https://issues.apache.org/jira/browse/HADOOP-9615
Project: Hadoop Common
Issue Type: Bug
Components: fs
Affects Versions: 2.0.0-alpha
Environment: CentOS,
Reporter: Deepa Vasanthkumar
Labels: hadoop-2.0

Unable to invoke the 'hadoop jar' command for a class that contains a Spring persistence unit. The problem is that the jar file uses Spring ORM to load the persistence configurations, and based on these configurations, I need to move the files to HDFS. When invoking the jar with the hadoop jar command (with Spring ORM injected), the exception is:

Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.springframework.dao.annotation.PersistenceExceptionTranslationPostProcessor#0' defined in class path resource [applicationContext.xml Error creating bean with name 'entityManagerFactory' defined in class path resource [applicationContext.xml]: Invocation of init method failed; nested exception is java.lang.IllegalStateException: Conflicting persistence unit definitions for name 'Persistance': file:/home/user/Desktop/ABC/apnJar.jar, file:/tmp/hadoop-user/hadoop-unjar2841422106164401019/
Caused by: java.lang.IllegalStateException: Conflicting persistence unit definitions for name 'Persistance':
[jira] [Created] (HADOOP-9632) TestShellCommandFencer will fail if there is a 'host' machine in the network
Chuan Liu created HADOOP-9632:

Summary: TestShellCommandFencer will fail if there is a 'host' machine in the network
Key: HADOOP-9632
URL: https://issues.apache.org/jira/browse/HADOOP-9632
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor

TestShellCommandFencer will fail if there is a machine named ‘host’ in the network. The %target_address% environment variable used in the test comes from the result of InetSocketAddress.getAddress(). The method returns 'host/ip' instead of only 'host' when the host actually exists in the network. When the test compares the log output, it assumes there is no IP in the address.
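The 'host/ip' behaviour described in HADOOP-9632 can be reproduced without any DNS lookup. The sketch below assumes the test ends up with the string form of the InetAddress returned by InetSocketAddress.getAddress(); the class HostAddressSketch and its method names are hypothetical and are not part of TestShellCommandFencer. It contrasts that string form with getHostString(), which yields only the host name.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

/** Illustration only; the class and method names here are hypothetical. */
final class HostAddressSketch {
    /** Builds a resolved socket address from a name and IPv4 octets, with no DNS lookup. */
    static InetSocketAddress resolved(String host, int a, int b, int c, int d, int port) {
        try {
            byte[] ip = {(byte) a, (byte) b, (byte) c, (byte) d};
            return new InetSocketAddress(InetAddress.getByAddress(host, ip), port);
        } catch (UnknownHostException e) {
            // Only thrown for an illegal address length, which cannot happen here.
            throw new IllegalStateException(e);
        }
    }

    /** String form of getAddress(); includes the IP once the host is resolved. */
    static String addressString(InetSocketAddress addr) {
        return String.valueOf(addr.getAddress());
    }

    /** Host name only, whether or not the address is resolved. */
    static String hostOnly(InetSocketAddress addr) {
        return addr.getHostString();
    }

    public static void main(String[] args) {
        InetSocketAddress known = resolved("host", 10, 0, 0, 1, 1234);
        InetSocketAddress unknown = InetSocketAddress.createUnresolved("host", 1234);
        System.out.println(addressString(known) + " vs " + hostOnly(known));
        System.out.println(addressString(unknown) + " vs " + hostOnly(unknown));
    }
}
```

For an unresolved address getAddress() returns null, so a comparison that should not depend on the network would more safely use the host name alone.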
Re: [DISCUSS] Ensuring Consistent Behavior for Alternative Hadoop FileSystems + Workshop
I plan to attend. A 9:30 time is a little better for me.

sanjay

On Jun 5, 2013, at 8:14 PM, Stephen Watt wrote:

Hi Folks

Per Roman's recommendation I've created a Wiki page for organizing the work and managing the logistics - https://wiki.apache.org/hadoop/HCFS/Progress

I'd like to propose a Google Hangout at 9am PST on Monday June 10th to get together and discuss the initiative. Please respond back to me if you're interested or would like to propose a different time. I'll update our Wiki page with the logistics.

Regards
Steve Watt

- Original Message -
From: Roman Shaposhnik shaposh...@gmail.com
To: Stephen Watt sw...@redhat.com
Cc: common-dev@hadoop.apache.org, mbhandar...@gopivotal.com, shv hadoop shv.had...@gmail.com, ste...@hortonworks.com, erlv5...@gmail.com, apurt...@apache.org
Sent: Friday, May 31, 2013 5:28:58 PM
Subject: Re: [DISCUSS] Ensuring Consistent Behavior for Alternative Hadoop FileSystems + Workshop

On Fri, May 31, 2013 at 1:00 PM, Stephen Watt sw...@redhat.com wrote:

What is the protocol for organizing the logistics and collaborating? I am loath to flood common-dev with "does this time work for you?" emails from the interested parties. Do we create a high level JIRA ticket and collaborate and post comments and G+ meetup times on that? Another option might be the Wiki. I'd be happy to be responsible for tracking progress on https://wiki.apache.org/hadoop/HCFS/Progress until we are able to break initiatives down into more granular JIRA tickets.

I'd go with a wiki page and perhaps http://www.doodle.com/

After we've had a few G+ hangouts, for those that would like to meet face to face, I have also made an all day reservation for a meeting room that can hold up to 20 people at our Red Hat Office in Castro Street, Mountain View on Tuesday June 25th (the day before Hadoop Summit and a short drive away). We don't have to use the whole day, but it gives us some flexibility around the availability of interested parties. I was thinking something along the lines of 10am - 3pm. We are happy to cater lunch.

That also would be very much appreciated!

Thanks,
Roman.
[jira] [Created] (HADOOP-9633) An incorrect data node might be added to the network topology, an exception is thrown though
Xi Fang created HADOOP-9633:

Summary: An incorrect data node might be added to the network topology, an exception is thrown though
Key: HADOOP-9633
URL: https://issues.apache.org/jira/browse/HADOOP-9633
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Xi Fang
Priority: Minor

In NetworkTopology#add(Node node), an incorrect node may be added to the cluster even if an exception is thrown. This is the original code:
{code}
if (clusterMap.add(node)) {
  LOG.info("Adding a new node: " + NodeBase.getPath(node));
  if (rack == null) {
    numOfRacks++;
  }
  if (!(node instanceof InnerNode)) {
    if (depthOfAllLeaves == -1) {
      depthOfAllLeaves = node.getLevel();
    } else {
      if (depthOfAllLeaves != node.getLevel()) {
        LOG.error("Error: can't add leaf node at depth " +
            node.getLevel() + " to topology:\n" + oldTopoStr);
        throw new InvalidTopologyException("Invalid network topology. " +
            "You cannot have a rack and a non-rack node at the same " +
            "level of the network topology.");
      }
    }
  }
}
{code}
This is a potential bug, because a wrong leaf node has already been added to the cluster before the exception is thrown. However, we can't check (depthOfAllLeaves != node.getLevel()) before if (clusterMap.add(node)), because node.getLevel() works correctly only after clusterMap.add(node) has been executed. A possible solution is to check depthOfAllLeaves inside clusterMap.add(node). Note that this is a recursive call; the check should be placed at the bottom of the recursion. If the check fails, do not add this leaf or any of its upstream racks.
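The invariant asked for in HADOOP-9633 is that a rejected leaf must leave the topology untouched. The sketch below shows that validate-before-mutate ordering on a deliberately simplified model. It is not the NetworkTopology code and not the recursive fix proposed in the report: the class TopologySketch is hypothetical, leaves are plain path strings, and the level is computed from the path before insertion, which the real Node.getLevel() cannot do.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Simplified, hypothetical model; not the Hadoop NetworkTopology class. */
final class TopologySketch {
    private final Set<String> leaves = new LinkedHashSet<>();
    private int depthOfAllLeaves = -1;  // -1 until the first leaf is added

    /** Level a leaf would have, e.g. 2 for "/rack1/node1" and 1 for "/node2". */
    static int levelOf(String leafPath) {
        int level = 0;
        for (String part : leafPath.split("/")) {
            if (!part.isEmpty()) {
                level++;
            }
        }
        return level;
    }

    /** Adds a leaf; throws before any state changes if its depth is inconsistent. */
    boolean add(String leafPath) {
        int level = levelOf(leafPath);
        // Validate first, so a rejected leaf never reaches the set.
        if (depthOfAllLeaves != -1 && depthOfAllLeaves != level) {
            throw new IllegalArgumentException(
                "Cannot add leaf at depth " + level + "; existing leaves are at depth "
                    + depthOfAllLeaves);
        }
        boolean added = leaves.add(leafPath);
        if (added && depthOfAllLeaves == -1) {
            depthOfAllLeaves = level;
        }
        return added;
    }

    boolean contains(String leafPath) {
        return leaves.contains(leafPath);
    }

    int size() {
        return leaves.size();
    }

    public static void main(String[] args) {
        TopologySketch topology = new TopologySketch();
        topology.add("/rack1/node1");
        try {
            topology.add("/node2");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected, size is still " + topology.size());
        }
    }
}
```

In the real class the same ordering would have to happen inside the recursive add, since the level is only known there.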
[jira] [Created] (HADOOP-9634) Duplicate Request Cache in ONCRPC needs to resend response for served requests
Brandon Li created HADOOP-9634:

Summary: Duplicate Request Cache in ONCRPC needs to resend response for served requests
Key: HADOOP-9634
URL: https://issues.apache.org/jira/browse/HADOOP-9634
Project: Hadoop Common
Issue Type: Bug
Reporter: Brandon Li
Assignee: Brandon Li

The duplicate request cache can drop a repeated request that is still pending, but it should resend the response for an already served request if the response is cached.
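The behaviour requested in HADOOP-9634 amounts to a three-way decision per incoming request. The following is a minimal sketch of that decision, not the Hadoop ONCRPC code: the class DuplicateRequestCacheSketch, the Action enum and the method names are hypothetical, entries are keyed by the RPC xid alone (a real cache would likely also key on the client address), and there is no eviction.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a duplicate request cache; not the Hadoop implementation. */
final class DuplicateRequestCacheSketch {
    enum Action { PROCESS, DROP, RESEND }

    // xid -> cached response; a null value marks a request that is still in progress.
    private final Map<Integer, byte[]> entries = new HashMap<>();

    /** Decides what to do with an incoming request carrying the given xid. */
    synchronized Action onRequest(int xid) {
        if (!entries.containsKey(xid)) {
            entries.put(xid, null);   // first sighting: remember it as pending
            return Action.PROCESS;
        }
        // Retransmission: drop while pending, resend once a response is cached.
        return entries.get(xid) == null ? Action.DROP : Action.RESEND;
    }

    /** Records the response once the request has been served. */
    synchronized void onCompleted(int xid, byte[] response) {
        entries.put(xid, response.clone());
    }

    /** Returns a copy of the cached response, or null if none is cached. */
    synchronized byte[] cachedResponse(int xid) {
        byte[] response = entries.get(xid);
        return response == null ? null : response.clone();
    }

    public static void main(String[] args) {
        DuplicateRequestCacheSketch cache = new DuplicateRequestCacheSketch();
        System.out.println(cache.onRequest(7));   // new request
        System.out.println(cache.onRequest(7));   // retransmitted while pending
        cache.onCompleted(7, new byte[] {1, 2});
        System.out.println(cache.onRequest(7));   // retransmitted after completion
    }
}
```

A caller would send cachedResponse(xid) back to the client whenever onRequest returns RESEND.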