[jira] [Created] (HADOOP-13439) Fix race between TestMetricsSystemImpl and TestGangliaMetrics
Masatake Iwasaki created HADOOP-13439:
--------------------------------------
Summary: Fix race between TestMetricsSystemImpl and TestGangliaMetrics
Key: HADOOP-13439
URL: https://issues.apache.org/jira/browse/HADOOP-13439
Project: Hadoop Common
Issue Type: Bug
Components: test
Reporter: Masatake Iwasaki
Priority: Minor

TestGangliaMetrics#testGangliaMetrics2 sets *.period to 120, but 8 was used:

{noformat}
2016-06-27 15:21:31,480 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(375)) - Scheduled snapshot period at 8 second(s).
{noformat}
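The race comes from two test classes sharing the same metrics configuration. A minimal sketch of the isolation idea, assuming the metrics2 test helpers ConfigBuilder and TestMetricsConfig (the "gangliatest" prefix and file name here are hypothetical, and exact signatures may differ by branch):

{code}
import org.apache.hadoop.metrics2.MetricsSystem;
import org.apache.hadoop.metrics2.impl.ConfigBuilder;
import org.apache.hadoop.metrics2.impl.MetricsSystemImpl;
import org.apache.hadoop.metrics2.impl.TestMetricsConfig;

// Hypothetical sketch: give the Ganglia test its own config prefix and
// config file, so a MetricsSystem left running by another test (here, one
// started with period 8) cannot feed it stale settings.
public class GangliaConfigIsolation {
  static MetricsSystem startIsolated() {
    new ConfigBuilder()
        .add("gangliatest.*.period", 120)  // unique "gangliatest" prefix
        .save(TestMetricsConfig.getTestFilename("hadoop-metrics2-gangliatest"));
    MetricsSystem ms = new MetricsSystemImpl("gangliatest");
    return ms.init("gangliatest");  // picks up only gangliatest.* keys
  }
}
{code}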
Re: Yes/No newbie question on contributing
I don't have permissions to edit the Wiki, but I've included a link below to my proposed revisions to the How To Contribute page. As a reminder, these changes are meant to make it clear that one does not need to run/pass *all* project unit tests before starting to write code or submit a patch. Knowing this as a newbie would have saved me a lot of time.

I'm not sure whether my edits cover the suggestion of instructing folks to run the same checks as are done in the automated precommit builds...I don't know what those checks are. And we had not concluded whether to instruct folks as such or not...thoughts?

https://docs.google.com/document/d/1wvGFQ9SgELwCnPanmZ4FmN_uNLW5iyMkVIr3A4CNrTk/edit

Best,
Martin

On Tue, Jul 26, 2016 at 2:58 PM, Martin Rosse wrote:
> Thanks everyone...that helped. I'll go ahead and edit the Wiki to clarify the expectation.
>
> I got a successful build using:
>
> ~/code/hadoop$ mvn install -DskipTests
>
> To respond to Vinod's questions:
>
> I think the answer is trunk. I obtained the source code using:
>
> git clone git://git.apache.org/hadoop.git
>
> ...and the pom.xml in my source says version 3.0.0-alpha1-SNAPSHOT, and I haven't tried to do anything with branches yet.
>
> You were right--without knowing any better I was running all the unit tests, so I came across several errors...one error that I was able to fix was apparently due to a newline in the etc/hosts file, as I learned from https://issues.apache.org/jira/browse/HADOOP-10888. After my fix, a subsequent build passed that unit test. But then a subsequent build to that build caused that same error again, even though the newline was fixed.
>
> Another error I got when running mvn install without -DskipTests is described in https://issues.apache.org/jira/browse/HADOOP-12611. This is the type of error I thought would be worthy of ignoring.
>
> Thanks again for your time--much appreciated!
>
> -Martin
>
> On Tue, Jul 26, 2016 at 1:27 PM, Sean Busbey wrote:
>> The current HowToContribute guide expressly tells folks that they should ensure all the tests run and pass before and after their change.
>>
>> Sounds like we're due for an update if the expectation is now that folks should be using -DskipTests and runs on particular modules. Maybe we could instruct folks on running the same checks we'll do in the automated precommit builds?
>>
>> On Tue, Jul 26, 2016 at 1:47 PM, Vinod Kumar Vavilapalli wrote:
>>> The short answer is that it is expected to pass without any errors.
>>>
>>> On branch-2.x, that command passes cleanly without any errors though it takes north of 10 minutes. Note that I run it with -DskipTests - you don't want to wait for all the unit tests to run, that'll take too much time. I expect trunk to be the same too.
>>>
>>> Which branch are you running this against? What errors are you seeing? If it is unit-tests you are talking about, you can instead run with skipTests, run only specific tests or all tests in the module you are touching, make sure they pass and then let Jenkins infrastructure run the remaining tests when you submit the patch.
>>>
>>> +Vinod
>>>
>>>> On Jul 26, 2016, at 11:41 AM, Martin Rosse wrote:
>>>>
>>>> Hi,
>>>>
>>>> In the How To Contribute doc, it says:
>>>>
>>>> "Try getting the project to build and test locally before writing code"
>>>>
>>>> So, just to be 100% certain before I keep troubleshooting things, this means I should be able to run
>>>>
>>>> mvn clean install -Pdist -Dtar
>>>>
>>>> without getting any failures or errors at all...none...zero, right?
>>>>
>>>> I am surprised at how long this is taking as errors keep cropping up. Should I just expect it to really take many hours (already at 10+) to work through these issues? I am setting up a dev environment on an Ubuntu 14.04 64-bit desktop from the AWS marketplace running on EC2.
>>>>
>>>> It would seem it's an obvious YES answer, but given the time investment I've been making I just wanted to be absolutely sure before continuing.
>>>>
>>>> I thought it possible that maybe some errors, depending on their nature, can be overlooked, and that other developers may be doing that in practice. And hence perhaps I should as well to save time. Yes or No??
>>>>
>>>> Thank you,
>>>>
>>>> Martin
>>
>> --
>> busbey
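Concretely, the workflow Vinod describes looks something like the following; the module path and test class name are placeholders for illustration, not prescriptions from the thread:

    # Full build without running unit tests (takes roughly 10+ minutes):
    ~/code/hadoop$ mvn install -DskipTests

    # Then run only the tests for the module you are touching:
    ~/code/hadoop$ cd hadoop-common-project/hadoop-common
    hadoop-common$ mvn test -Dtest=TestConfiguration   # a single test class
    hadoop-common$ mvn test                            # all tests in this module

Jenkins then runs the remaining tests across the whole tree when the patch is submitted.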
Re: [DISCUSS] Release numbering semantics with concurrent (>2) releases [Was Setting JIRA fix versions for 3.0.0 releases]
I've written up the proposal from my initial reply in a GDoc. I found one bug in the rules when working through my example again, and also incorporated Akira's correction. Thanks all for the discussion so far!

https://docs.google.com/document/d/1vlDtpsnSjBPIZiWQjSwgnV0_Z6ZQJ1r91J8G0FduyTg/edit

Ping me if you'd like edit/comment privs, or send comments to this thread. I'm eager to close on this so we can keep pushing on the 2.8.0 and 3.0.0-alpha1 releases. I'd like to post this content somewhere official early next week, so if you have additional feedback, please keep it coming.

Best,
Andrew

On Thu, Jul 28, 2016 at 3:01 PM, Karthik Kambatla wrote:
> Inline.
>
>>> BTW, I have never seen a clear definition for an alpha release. It has previously been used to mean unstable APIs (2.1-alpha, 2.2-alpha, etc.) but sometimes means unstable production quality (2.7.0). I think we should clearly define it with major consensus so users won't misunderstand the risk here.
>>
>> These are the definitions of "alpha" and "beta" used leading up to the 2.2 GA release, so it's not something new. These are also the normal industry definitions. Alpha means no API compatibility guarantees, early software. Beta means API compatible, but still some bugs.
>>
>> If anything, we never defined the terms "alpha" and "beta" for 2.x releases post-2.2 GA. The thinking was that everything after would be compatible and thus (at the least) never alpha. I think this is why the website talks about the 2.7.x line as "stable" or "unstable" instead, but since I think we still guarantee API compatibility between 2.7.0 and 2.7.1, we could have just called 2.7.0 "beta".
>>
>> I think this would be good to have in our compat guidelines or somewhere. Happy to work with Karthik/Vinod/others on this.
>
> I am not sure if we formally defined the terms "alpha" and "beta" for Hadoop 2, but my understanding of them agrees with the general definitions on the web.
>
> Alpha:
> - Early version for testing - integration with downstream, deployment, etc.
> - Not feature complete
> - No compatibility guarantees yet
>
> Beta:
> - Feature complete
> - API compatibility guaranteed
> - Need clear definition for other kinds of compatibility (wire, client-dependencies, server-dependencies, etc.)
> - Not ready for production deployments
>
> GA:
> - Ready for production
> - All the usual compatibility guarantees apply.
>
> If there is general agreement, I can work towards getting this into our documentation.
>
>>> Also, if we treat our 3.0.0-alpha release work seriously, we should also think about trunk's version number issue (bump up to 4.0.0-alpha?) or there could be no room for 3.0 incompatible feature/bits soon.
>>
>> While we're still in alpha for 3.0.0, there's no need for a separate 4.0.0 version since there's no guarantee of API compatibility. I plan to cut a branch-3 for the beta period, at which point we'll upgrade trunk to 4.0.0-alpha1. This is something we discussed on another mailing list thread.
>
> Branching at beta time seems reasonable.
>
> Overall, are there any incompatible changes on trunk that we wouldn't be comfortable shipping in 3.0.0? If yes, do we feel comfortable shipping those bits ever?
>
>> Best,
>> Andrew
Updated 2.8.0-SNAPSHOT artifact
The latest snapshot was uploaded in Nov 2015, but checkins are still coming in quite frequently:

https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-yarn-api/

Are there any plans to start producing updated SNAPSHOT artifacts for the current Hadoop development lines?
Re: Heads up: branched branch-3.0.0-alpha1
Looking at the state of branch-3.0.0-alpha1 and the fix versions, we're already out of sync. I think the easiest solution is to (close to the release date) rebranch and change any 3.0.0-alpha2 fix versions to 3.0.0-alpha1. I think the versioning discussions are converging, so hopefully soon. I'll send another email when this happens.

On Fri, Jul 15, 2016 at 7:26 PM, Andrew Wang wrote:
> Hi all,
>
> You might have already noticed from the bulk JIRA updates, but I've branched branch-3.0.0-alpha1 off trunk, and updated trunk to be 3.0.0-alpha2. For most changes, you can just commit to trunk and the branch-2s. Just remember to use the new 3.0.0-alpha2 version where appropriate.
>
> I still need to update the markdown release notes and double check things, but hopefully an RC0 will be coming down the pipe soon.
>
> Thanks,
> Andrew
Re: [DISCUSS] Release numbering semantics with concurrent (>2) releases [Was Setting JIRA fix versions for 3.0.0 releases]
Inline.

>> BTW, I have never seen a clear definition for an alpha release. It has previously been used to mean unstable APIs (2.1-alpha, 2.2-alpha, etc.) but sometimes means unstable production quality (2.7.0). I think we should clearly define it with major consensus so users won't misunderstand the risk here.
>
> These are the definitions of "alpha" and "beta" used leading up to the 2.2 GA release, so it's not something new. These are also the normal industry definitions. Alpha means no API compatibility guarantees, early software. Beta means API compatible, but still some bugs.
>
> If anything, we never defined the terms "alpha" and "beta" for 2.x releases post-2.2 GA. The thinking was that everything after would be compatible and thus (at the least) never alpha. I think this is why the website talks about the 2.7.x line as "stable" or "unstable" instead, but since I think we still guarantee API compatibility between 2.7.0 and 2.7.1, we could have just called 2.7.0 "beta".
>
> I think this would be good to have in our compat guidelines or somewhere. Happy to work with Karthik/Vinod/others on this.

I am not sure if we formally defined the terms "alpha" and "beta" for Hadoop 2, but my understanding of them agrees with the general definitions on the web.

Alpha:
- Early version for testing - integration with downstream, deployment, etc.
- Not feature complete
- No compatibility guarantees yet

Beta:
- Feature complete
- API compatibility guaranteed
- Need clear definition for other kinds of compatibility (wire, client-dependencies, server-dependencies, etc.)
- Not ready for production deployments

GA:
- Ready for production
- All the usual compatibility guarantees apply.

If there is general agreement, I can work towards getting this into our documentation.

>> Also, if we treat our 3.0.0-alpha release work seriously, we should also think about trunk's version number issue (bump up to 4.0.0-alpha?) or there could be no room for 3.0 incompatible feature/bits soon.
>
> While we're still in alpha for 3.0.0, there's no need for a separate 4.0.0 version since there's no guarantee of API compatibility. I plan to cut a branch-3 for the beta period, at which point we'll upgrade trunk to 4.0.0-alpha1. This is something we discussed on another mailing list thread.

Branching at beta time seems reasonable.

Overall, are there any incompatible changes on trunk that we wouldn't be comfortable shipping in 3.0.0? If yes, do we feel comfortable shipping those bits ever?

> Best,
> Andrew
[jira] [Created] (HADOOP-13438) Optimize IPC server protobuf decoding
Daryn Sharp created HADOOP-13438:
---------------------------------
Summary: Optimize IPC server protobuf decoding
Key: HADOOP-13438
URL: https://issues.apache.org/jira/browse/HADOOP-13438
Project: Hadoop Common
Issue Type: Sub-task
Reporter: Daryn Sharp
Assignee: Daryn Sharp

The current use of the protobuf API uses an expensive code path. The builder uses the parser to instantiate a message, then copies the message into the builder. The parser creates multi-layered, internally buffering streams that cause excessive byte[] allocations.

Using the parser directly, with a coded input stream backed by the byte[] from the wire, takes a fast path straight to the pb message's ctor. Substantially less garbage is generated.
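A rough sketch of the two paths, using one of the RPC header messages as an example (whether this particular message is on the hot path is an assumption; the calls themselves are the standard protobuf generated-message API):

{code}
import com.google.protobuf.CodedInputStream;
import org.apache.hadoop.ipc.protobuf.RpcHeaderProtos.RpcRequestHeaderProto;

public class PbDecodePaths {
  // Slow path: the builder parses into a temporary message and then copies
  // that message's fields into itself, with extra buffering and byte[] churn.
  static RpcRequestHeaderProto viaBuilder(byte[] wire) throws Exception {
    return RpcRequestHeaderProto.newBuilder().mergeFrom(wire).build();
  }

  // Fast path: a CodedInputStream backed directly by the wire byte[] feeds
  // the parser, which goes straight to the message constructor.
  static RpcRequestHeaderProto viaParser(byte[] wire) throws Exception {
    CodedInputStream in = CodedInputStream.newInstance(wire);
    return RpcRequestHeaderProto.parseFrom(in);
  }
}
{code}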
Apache Hadoop qbt Report: trunk+JDK8 on Linux/ppc64le
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/31/

[Jul 21, 2016 3:38:20 AM] (lei) HADOOP-12928. Update netty to 3.10.5.Final to sync with zookeeper. (lei)
[Jul 21, 2016 6:50:47 AM] (rohithsharmaks) YARN-1126. Add validation of users input nodes-states options to nodes
[Jul 21, 2016 7:17:27 AM] (rohithsharmaks) YARN-5092. TestRMDelegationTokens fails intermittently. Contributed by
[Jul 21, 2016 6:14:39 PM] (jing9) HDFS-10653. Optimize conversion from path string to components.
[Jul 21, 2016 6:26:08 PM] (aajisaka) HDFS-10287. MiniDFSCluster should implement AutoCloseable. Contributed
[Jul 21, 2016 6:34:48 PM] (aajisaka) MAPREDUCE-6738. TestJobListCache.testAddExisting failed intermittently
[Jul 21, 2016 9:12:31 PM] (cnauroth) HADOOP-13240. TestAclCommands.testSetfaclValidations fail. Contributed
[Jul 21, 2016 9:43:57 PM] (mfoley) HADOOP-13382. Remove unneeded commons-httpclient dependencies from POM
[Jul 21, 2016 11:41:02 PM] (xiao) HDFS-10225. DataNode hot swap drives should disallow storage type
[Jul 22, 2016 6:21:47 AM] (cdouglas) HADOOP-13393. Omit unsupported fs.defaultFS setting in ADLS
[Jul 22, 2016 4:16:38 PM] (cnauroth) HADOOP-13392. [Azure Data Lake] OAuth2 configuration should be default
[Jul 22, 2016 6:08:20 PM] (uma.gangumalla) HDFS-10565: Erasure Coding: Document about the current allowed storage
[Jul 22, 2016 7:33:50 PM] (arp) HDFS-10660. Expose storage policy apis via HDFSAdmin interface.
[Jul 22, 2016 10:38:18 PM] (cnauroth) HADOOP-13207. Specify FileSystem listStatus, listFiles and
[Jul 23, 2016 1:08:12 AM] (cdouglas) HADOOP-13272. ViewFileSystem should support storage policy related API.
[Jul 23, 2016 9:45:33 AM] (kai.zheng) HDFS-10651. Clean up some configuration related codes about legacy block
[Jul 23, 2016 5:00:08 PM] (stevel) HADOOP-13389 TestS3ATemporaryCredentials.testSTS error when using IAM
[Jul 25, 2016 1:45:03 PM] (stevel) HADOOP-13406 S3AFileSystem: Consider reusing filestatus in delete() and
[Jul 25, 2016 2:50:23 PM] (stevel) HADOOP-13188 S3A file-create should throw error rather than overwrite
[Jul 25, 2016 9:54:48 PM] (jlowe) MAPREDUCE-6744. Increase timeout on TestDFSIO tests. Contributed by Eric
[Jul 25, 2016 11:37:50 PM] (cdouglas) YARN-5164. Use plan RLE to improve CapacityOverTimePolicy efficiency
[Jul 26, 2016 1:41:13 AM] (jing9) HDFS-10688. BPServiceActor may run into a tight loop for sending block
[Jul 26, 2016 1:48:21 AM] (iwasakims) HDFS-10671. Fix typo in HdfsRollingUpgrade.md. Contributed by Yiqun Lin.
[Jul 26, 2016 1:50:59 AM] (shv) HDFS-10301. Interleaving processing of storages from repeated block
[Jul 26, 2016 5:24:24 AM] (brahma) HDFS-10668. Fix intermittently failing UT
[Jul 26, 2016 1:30:02 PM] (stevel) Revert "HDFS-10668. Fix intermittently failing UT
[Jul 26, 2016 1:53:37 PM] (kai.zheng) HADOOP-13041. Adding tests for coder utilities. Contributed by Kai
[Jul 26, 2016 3:01:42 PM] (weichiu) HDFS-9937. Update dfsadmin command line help and HdfsQuotaAdminGuide.
[Jul 26, 2016 3:19:06 PM] (varunsaxena) YARN-5431. TimelineReader daemon start should allow to pass its own
[Jul 26, 2016 3:43:12 PM] (varunsaxena) Revert "YARN-5431. TimelineReader daemon start should allow to pass its
[Jul 26, 2016 7:27:46 PM] (arp) HDFS-10642.
[Jul 26, 2016 9:54:03 PM] (Arun Suresh) YARN-5392. Replace use of Priority in the Scheduling infrastructure with
[Jul 26, 2016 10:33:20 PM] (cnauroth) HADOOP-13422. ZKDelegationTokenSecretManager JaasConfig does not work
[Jul 26, 2016 11:01:50 PM] (weichiu) HDFS-10598. DiskBalancer does not execute multi-steps plan. Contributed
[Jul 27, 2016 1:14:09 AM] (wangda) YARN-5342. Improve non-exclusive node partition resource allocation in
[Jul 27, 2016 2:08:30 AM] (Arun Suresh) YARN-5351. ResourceRequest should take ExecutionType into account during
[Jul 27, 2016 4:22:59 AM] (wangda) YARN-5195. RM intermittently crashed with NPE while handling
[Jul 27, 2016 4:56:42 AM] (brahma) HDFS-10668. Fix intermittently failing UT
[Jul 27, 2016 10:41:09 AM] (aajisaka) HADOOP-9427. Use JUnit assumptions to skip platform-specific tests.
[Jul 27, 2016 8:58:04 PM] (yzhang) HDFS-10667. Report more accurate info about data corruption location.
[Jul 27, 2016 10:50:38 PM] (cnauroth) HADOOP-13354. Update WASB driver to use the latest version (4.2.0) of
[Jul 28, 2016 12:55:41 AM] (wang) HDFS-10519. Add a configuration option to enable in-progress edit log
[Jul 28, 2016 1:21:58 AM] (subru) YARN-5441. Fixing minor Scheduler test case failures
[Jul 28, 2016 3:06:09 AM] (varunsaxena) YARN-5431. TimelineReader daemon start should allow to pass its own
[Jul 28, 2016 7:58:23 AM] (aajisaka) HDFS-10696. TestHDFSCLI fails. Contributed by Kai Sasaki.
[Jul 28, 2016 1:35:24 PM] (junping_du) YARN-5432. Lock already held by another process while LevelDB cache
[Jul 28, 2016 5:23:18 PM] (gtcarrera9) YARN-5440. Use AHSClient in YarnClient when TimelineServer is running.
[jira] [Created] (HADOOP-13437) KMS should reload whitelist and default key ACLs when hot-reloading
Xiao Chen created HADOOP-13437:
-------------------------------
Summary: KMS should reload whitelist and default key ACLs when hot-reloading
Key: HADOOP-13437
URL: https://issues.apache.org/jira/browse/HADOOP-13437
Project: Hadoop Common
Issue Type: Bug
Components: kms
Affects Versions: 2.6.0
Reporter: Xiao Chen
Assignee: Xiao Chen

When hot-reloading, {{KMSACLs#setKeyACLs}} ignores whitelist and default key entries if they're present in memory. We should reload them; hot-reload and cold-start should not have any difference in behavior.
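A sketch of the intended reload semantics; the class and field names below are illustrative, not the actual KMSACLs internals, though the default.key.acl./whitelist.key.acl. prefixes follow the KMS configuration format:

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: rebuild every ACL map purely from the freshly read
// configuration and swap the references, so a hot-reload can never keep
// stale default/whitelist entries that a cold start would not have.
public class AclReloadSketch {
  private volatile Map<String, String> defaultKeyAcls = new HashMap<>();
  private volatile Map<String, String> whitelistKeyAcls = new HashMap<>();

  void setKeyACLs(Map<String, String> conf) {
    Map<String, String> newDefaults = new HashMap<>();
    Map<String, String> newWhitelist = new HashMap<>();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      String k = e.getKey();
      if (k.startsWith("default.key.acl.")) {
        newDefaults.put(k.substring("default.key.acl.".length()), e.getValue());
      } else if (k.startsWith("whitelist.key.acl.")) {
        newWhitelist.put(k.substring("whitelist.key.acl.".length()), e.getValue());
      }
    }
    // Unconditional swap -- never "keep the old value if one is in memory".
    defaultKeyAcls = newDefaults;
    whitelistKeyAcls = newWhitelist;
  }
}
{code}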
Re: Wiki migration and clean-up
Good point Allen. I expect that moving everything to the new wiki will take a while. Once that's done, the code can be changed. Just doing a quick grep, I see a total of 12 places to change to point to the new Wiki (there may be more). For existing installs, we can either keep a small subset of the old pages or add redirects/pointers from the old wiki to the new location.

-Ray

On 7/28/16 11:07 AM, Allen Wittenauer wrote:
> I hope you folks are aware that this is much more intensive than just moving a bunch of documents. Lots of wiki pages are referenced in the source code, including in user-facing error messages.
>
> On Jul 28, 2016, at 10:47 AM, Ray Chiang wrote:
>> Thanks Martin. I did ask on INFRA-12342, and it looks like Confluence Wiki is the recommended "latest and greatest".
>>
>> Here's my proposal as it currently stands:
>>
>> 1) Move to Confluence Wiki.
>>
>> 2) Move all the Industry/meetup content to a single page with a small set of external links. This will be mostly of the form, "if you want to know more you can get started with...".
>>
>> 3) Have one other page for users just getting started. The updated IRC information, mailing lists, and the fact that JIRA isn't for user support will go here.
>>
>> 4) Keep and reorganize the more detailed technical information (developers, advanced users, and admins) on the Wiki. For this, I have no doubt I'll be copying large chunks of the old Wiki, but likely updating any pre-branch-2 information.
>>
>> 5) Once everything is moved, organized, and gets enough +1's from the community, update the pointers to the new Wiki and obsolete the old one.
>>
>> Any further discussion is still welcome.
>>
>> -Ray
>>
>> On 7/27/16 12:08 PM, Martin Rosse wrote:
>>> Hi Ray,
>>>
>>> The migration is much needed, and thanks for initiating it.
>>>
>>> Regarding approaches to cleaning up the Wiki content--my 2 cents is in favor of an approach similar to the Spark cwiki:
>>>
>>> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>>>
>>> My take is that the Hadoop product docs on hadoop.apache.org generally target (or should target) the audiences you describe in 1-4, while the Wiki is (should be) primarily for audience #5 or "Hadoop staff"--internal Hadoop development, product management, QA, etc.
>>>
>>> Definitely current Wiki content such as "Overview of Hadoop" and the link to "Single Node Hadoop Cluster" installation is redundant, unnecessary doc maintenance, and annoying to come across as a user, because you have to assess its value relative to the same/similar content in the product doc on hadoop.apache.org.
>>>
>>> BTW, I did some random testing of ASF project wikis hosted on cwiki.apache.org, and the pages for those sites definitely load much, much faster than ASF wiki pages using MoinMoin. Clearly no surprise.
>>>
>>> Best,
>>> Martin
>>>
>>> On Wed, Jul 27, 2016 at 10:29 AM, Ray Chiang wrote:
>>>> Good to know. It's certainly easier to set up an alternate location in any case and then do a wholesale migration. It saves from having that "under construction" look before it's complete.
>>>>
>>>> I'll get on the appropriate infra@ list and ask about recommendations.
>>>>
>>>> -Ray
>>>>
>>>> On 7/26/16 10:49 PM, Andrew Wang wrote:
>>>>> Hi Ray, if you're going to do a wiki cleanup, fair warning that I filed this INFRA JIRA about the wiki being terribly slow, and they closed it as WONTFIX:
>>>>>
>>>>> https://issues.apache.org/jira/browse/INFRA-12283
>>>>>
>>>>> So if you'd actually like to undertake a wiki cleanup, we should also consider migrating the content to a wiki that isn't terribly slow. I think cwiki.apache.org is better, but maybe we should ask infra what the preferred option is here. They might be able to help with a content migration too.
>>>>>
>>>>> On Tue, Jul 26, 2016 at 3:27 PM, Ray Chiang wrote:
>>>>>> Coming in late to an old thread.
>>>>>>
>>>>>> I was looking around at the Hadoop documentation (hadoop.apache.org and wiki.apache.org/hadoop) and I'd sum up the current state of the documentation as follows:
>>>>>>
>>>>>> 1. hadoop.apache.org is pretty clearly full of technical information. My only minor nit here is that the wiki pointer and the Git pointer at the top are really tiny.
>>>>>> 2. wiki.apache.org is simultaneously targeted to at least four audiences:
>>>>>>    1. Industry Users (broadest sense of Big Data Industry)
>>>>>>    2. Industry Developers (mostly those adding a layer like Hive does to MapReduce)
>>>>>>    3. Hadoop Users (those who just want to set up a small cluster)
>>>>>>    4. Hadoop Developers (e.g. using MapReduce APIs)
>>>>>>    5. Hadoop Internal Developers (eventual contributors)
>>>>>>
>>>>>> I'd like to initiate some cleanup of the wiki, but before I even start, I'd like to see if anyone has constructive suggestions or other approaches that would make this transition smoother.
>>>>>>
>>>>>> 1. Some sections, like Industry Users and Industry Developers, are growing so fast I'm not sure whether it's worth maintaining them in any meaningful format. I'd be inclined to make suggestions on where to start and let Google take them forward from there.
>>>>>> 2. Organize the developer section based on the pieces a new reader wants to learn (new to everything, new to Hadoop, all the tools for Hadoop development, "just check out code and go", etc).
>>>>>> 3. Organize the Users section a bit more. The "Setting up a Hadoop Cluster" is grouped well, but I'd perhaps rearrange the ordering a bit.
Re: Wiki migration and clean-up
I hope you folks are aware that this is much more intensive than just moving a bunch of documents. Lots of wiki pages are referenced in the source code, including in user-facing error messages.

> On Jul 28, 2016, at 10:47 AM, Ray Chiang wrote:
>
> Thanks Martin. I did ask on INFRA-12342, and it looks like Confluence Wiki is the recommended "latest and greatest".
>
> Here's my proposal as it currently stands:
>
> 1) Move to Confluence Wiki.
>
> 2) Move all the Industry/meetup content to a single page with a small set of external links. This will be mostly of the form, "if you want to know more you can get started with...".
>
> 3) Have one other page for users just getting started. The updated IRC information, mailing lists, and the fact that JIRA isn't for user support will go here.
>
> 4) Keep and reorganize the more detailed technical information (developers, advanced users, and admins) on the Wiki. For this, I have no doubt I'll be copying large chunks of the old Wiki, but likely updating any pre-branch-2 information.
>
> 5) Once everything is moved, organized, and gets enough +1's from the community, update the pointers to the new Wiki and obsolete the old one.
>
> Any further discussion is still welcome.
>
> -Ray
>
> On 7/27/16 12:08 PM, Martin Rosse wrote:
>> Hi Ray,
>>
>> The migration is much needed, and thanks for initiating it.
>>
>> Regarding approaches to cleaning up the Wiki content--my 2 cents is in favor of an approach similar to the Spark cwiki:
>>
>> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>>
>> My take is that the Hadoop product docs on hadoop.apache.org generally target (or should target) the audiences you describe in 1-4, while the Wiki is (should be) primarily for audience #5 or "Hadoop staff"--internal Hadoop development, product management, QA, etc.
>>
>> Definitely current Wiki content such as "Overview of Hadoop" and the link to "Single Node Hadoop Cluster" installation is redundant, unnecessary doc maintenance, and annoying to come across as a user, because you have to assess its value relative to the same/similar content in the product doc on hadoop.apache.org.
>>
>> BTW, I did some random testing of ASF project wikis hosted on cwiki.apache.org, and the pages for those sites definitely load much, much faster than ASF wiki pages using MoinMoin. Clearly no surprise.
>>
>> Best,
>> Martin
>>
>> On Wed, Jul 27, 2016 at 10:29 AM, Ray Chiang wrote:
>>> Good to know. It's certainly easier to set up an alternate location in any case and then do a wholesale migration. It saves from having that "under construction" look before it's complete.
>>>
>>> I'll get on the appropriate infra@ list and ask about recommendations.
>>>
>>> -Ray
>>>
>>> On 7/26/16 10:49 PM, Andrew Wang wrote:
>>>> Hi Ray, if you're going to do a wiki cleanup, fair warning that I filed this INFRA JIRA about the wiki being terribly slow, and they closed it as WONTFIX:
>>>>
>>>> https://issues.apache.org/jira/browse/INFRA-12283
>>>>
>>>> So if you'd actually like to undertake a wiki cleanup, we should also consider migrating the content to a wiki that isn't terribly slow. I think cwiki.apache.org is better, but maybe we should ask infra what the preferred option is here. They might be able to help with a content migration too.
>>>>
>>>> On Tue, Jul 26, 2016 at 3:27 PM, Ray Chiang wrote:
>>>>> Coming in late to an old thread.
>>>>>
>>>>> I was looking around at the Hadoop documentation (hadoop.apache.org and wiki.apache.org/hadoop) and I'd sum up the current state of the documentation as follows:
>>>>>
>>>>> 1. hadoop.apache.org is pretty clearly full of technical information. My only minor nit here is that the wiki pointer and the Git pointer at the top are really tiny.
>>>>> 2. wiki.apache.org is simultaneously targeted to at least four audiences:
>>>>>    1. Industry Users (broadest sense of Big Data Industry)
>>>>>    2. Industry Developers (mostly those adding a layer like Hive does to MapReduce)
>>>>>    3. Hadoop Users (those who just want to set up a small cluster)
>>>>>    4. Hadoop Developers (e.g. using MapReduce APIs)
>>>>>    5. Hadoop Internal Developers (eventual contributors)
>>>>>
>>>>> I'd like to initiate some cleanup of the wiki, but before I even start, I'd like to see if anyone has constructive suggestions or other approaches that would make this transition smoother.
>>>>>
>>>>> 1. Some sections, like Industry Users and Industry Developers, are growing so fast I'm not sure whether it's worth maintaining them in any meaningful format. I'd be inclined to make suggestions on where to start and let Google take them forward from there.
>>>>> 2. Organize the developer section based on the pieces a new reader wants to learn (new to everything, new to Hadoop, all the tools for Hadoop development, "just check out code and go", etc).
Re: Wiki migration and clean-up
Big +1 from me. Better docs are incredibly helpful for our users and new contributors. This cleanup would be a great contribution.

If anyone else is looking for a side project, the website could also badly use a refresh. There aren't actually that many pages:

-> % find author -name "*.xml"
author/src/documentation/skinconf.xml
author/src/documentation/content/xdocs/index.xml
author/src/documentation/content/xdocs/releases.xml
author/src/documentation/content/xdocs/who.xml
author/src/documentation/content/xdocs/privacy_policy.xml
author/src/documentation/content/xdocs/site.xml
author/src/documentation/content/xdocs/version_control.xml
author/src/documentation/content/xdocs/tabs.xml
author/src/documentation/content/xdocs/bylaws.xml
author/src/documentation/content/xdocs/issue_tracking.xml
author/src/documentation/content/xdocs/mailing_lists.xml
author/src/documentation/skins/common/translations/CommonMessages_es.xml
author/src/documentation/skins/common/translations/CommonMessages_en_US.xml
author/src/documentation/skins/common/translations/CommonMessages_fr.xml
author/src/documentation/skins/common/translations/CommonMessages_de.xml

On Thu, Jul 28, 2016 at 10:47 AM, Ray Chiang wrote:
> Thanks Martin. I did ask on INFRA-12342, and it looks like Confluence Wiki is the recommended "latest and greatest".
>
> Here's my proposal as it currently stands:
>
> 1) Move to Confluence Wiki.
>
> 2) Move all the Industry/meetup content to a single page with a small set of external links. This will be mostly of the form, "if you want to know more you can get started with...".
>
> 3) Have one other page for users just getting started. The updated IRC information, mailing lists, and the fact that JIRA isn't for user support will go here.
>
> 4) Keep and reorganize the more detailed technical information (developers, advanced users, and admins) on the Wiki. For this, I have no doubt I'll be copying large chunks of the old Wiki, but likely updating any pre-branch-2 information.
>
> 5) Once everything is moved, organized, and gets enough +1's from the community, update the pointers to the new Wiki and obsolete the old one.
>
> Any further discussion is still welcome.
>
> -Ray
>
> On 7/27/16 12:08 PM, Martin Rosse wrote:
>> Hi Ray,
>>
>> The migration is much needed, and thanks for initiating it.
>>
>> Regarding approaches to cleaning up the Wiki content--my 2 cents is in favor of an approach similar to the Spark cwiki:
>>
>> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>>
>> My take is that the Hadoop product docs on hadoop.apache.org generally target (or should target) the audiences you describe in 1-4, while the Wiki is (should be) primarily for audience #5 or "Hadoop staff"--internal Hadoop development, product management, QA, etc.
>>
>> Definitely current Wiki content such as "Overview of Hadoop" and the link to "Single Node Hadoop Cluster" installation is redundant, unnecessary doc maintenance, and annoying to come across as a user, because you have to assess its value relative to the same/similar content in the product doc on hadoop.apache.org.
>>
>> BTW, I did some random testing of ASF project wikis hosted on cwiki.apache.org, and the pages for those sites definitely load much, much faster than ASF wiki pages using MoinMoin. Clearly no surprise.
>>
>> Best,
>> Martin
>>
>> On Wed, Jul 27, 2016 at 10:29 AM, Ray Chiang wrote:
>>> Good to know. It's certainly easier to set up an alternate location in any case and then do a wholesale migration. It saves from having that "under construction" look before it's complete.
>>>
>>> I'll get on the appropriate infra@ list and ask about recommendations.
>>>
>>> -Ray
>>>
>>> On 7/26/16 10:49 PM, Andrew Wang wrote:
>>>> Hi Ray, if you're going to do a wiki cleanup, fair warning that I filed this INFRA JIRA about the wiki being terribly slow, and they closed it as WONTFIX:
>>>>
>>>> https://issues.apache.org/jira/browse/INFRA-12283
>>>>
>>>> So if you'd actually like to undertake a wiki cleanup, we should also consider migrating the content to a wiki that isn't terribly slow. I think cwiki.apache.org is better, but maybe we should ask infra what the preferred option is here. They might be able to help with a content migration too.
>>>>
>>>> On Tue, Jul 26, 2016 at 3:27 PM, Ray Chiang wrote:
>>>>> Coming in late to an old thread.
>>>>>
>>>>> I was looking around at the Hadoop documentation (hadoop.apache.org and wiki.apache.org/hadoop) and I'd sum up the current state of the documentation as follows:
>>>>>
>>>>> 1. hadoop.apache.org is pretty clearly full of technical information. My only minor nit here is that the wiki pointer and the Git pointer at the top are really tiny.
>>>>> 2. wiki.apache.org is simultaneously targeted to at least four audiences
[jira] [Created] (HADOOP-13436) RPC connections are leaking due to missing equals override in RetryUtils#getDefaultRetryPolicy
Xiaobing Zhou created HADOOP-13436:
-----------------------------------
Summary: RPC connections are leaking due to missing equals override in RetryUtils#getDefaultRetryPolicy
Key: HADOOP-13436
URL: https://issues.apache.org/jira/browse/HADOOP-13436
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
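There is no description yet, but the title points at a familiar pattern: the IPC client caches connections under a key that includes the retry policy, so a policy without value-based equals()/hashCode() makes every key distinct and every identically configured caller open a fresh connection. A minimal sketch of the fix idea (class and field names hypothetical, not the actual RetryUtils code):

{code}
import java.util.Objects;

// Hypothetical sketch: make retry policies compare by configuration rather
// than identity, so two policies built from the same settings hash to the
// same connection-cache key and the underlying connection is reused.
public final class DefaultRetryPolicySketch {
  private final boolean retryEnabled;
  private final String retryPolicySpec; // e.g. "10000,6,60000,10"

  public DefaultRetryPolicySketch(boolean retryEnabled, String retryPolicySpec) {
    this.retryEnabled = retryEnabled;
    this.retryPolicySpec = retryPolicySpec;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof DefaultRetryPolicySketch)) {
      return false;
    }
    DefaultRetryPolicySketch that = (DefaultRetryPolicySketch) o;
    return retryEnabled == that.retryEnabled
        && Objects.equals(retryPolicySpec, that.retryPolicySpec);
  }

  @Override
  public int hashCode() {
    return Objects.hash(retryEnabled, retryPolicySpec);
  }
}
{code}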
[jira] [Created] (HADOOP-13435) Add thread local mechanism for aggregating file system storage stats
Mingliang Liu created HADOOP-13435:
-----------------------------------
Summary: Add thread local mechanism for aggregating file system storage stats
Key: HADOOP-13435
URL: https://issues.apache.org/jira/browse/HADOOP-13435
Project: Hadoop Common
Issue Type: Sub-task
Components: fs
Reporter: Mingliang Liu
Assignee: Mingliang Liu

As discussed in [HADOOP-13032], this is to add a thread-local mechanism for aggregating file system storage stats. This class will also be used in [HADOOP-13031], which is to separate the distance-oriented rack-aware read-bytes logic from {{FileSystemStorageStatistics}} into a new DFSRackAwareStorageStatistics, as it's DFS-specific. After this patch, {{FileSystemStorageStatistics}} can live without the to-be-removed {{FileSystem$Statistics}} implementation. A unit test should also be added.
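A sketch of the general thread-local aggregation pattern being proposed (names hypothetical; Hadoop's existing FileSystem$Statistics uses a similar scheme): writers bump a plain per-thread counter with no locking, and readers sum over a registry of every thread's counter:

{code}
import java.util.HashSet;
import java.util.Set;

public class ThreadLocalStatsSketch {
  /** One counter object per thread; only its owning thread writes it. */
  static final class PerThread {
    volatile long bytesRead; // volatile for reader visibility; single writer
  }

  private final Set<PerThread> allThreads = new HashSet<>();

  private final ThreadLocal<PerThread> local = ThreadLocal.withInitial(() -> {
    PerThread d = new PerThread();
    synchronized (allThreads) {
      allThreads.add(d); // register each thread's counter exactly once
    }
    return d;
  });

  /** Uncontended fast path, called on every read operation. */
  void incrementBytesRead(long n) {
    local.get().bytesRead += n;
  }

  /** Slow path for stats readers: aggregate across all threads. */
  long getBytesRead() {
    long sum = 0;
    synchronized (allThreads) {
      for (PerThread d : allThreads) {
        sum += d.bytesRead;
      }
    }
    return sum;
  }
}
{code}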
Re: Wiki migration and clean-up
Thanks Martin. I did ask on INFRA-12342, and it looks like Confluence Wiki is the recommended "latest and greatest".

Here's my proposal as it currently stands:

1) Move to Confluence Wiki.

2) Move all the Industry/meetup content to a single page with a small set of external links. This will be mostly of the form, "if you want to know more you can get started with...".

3) Have one other page for users just getting started. The updated IRC information, mailing lists, and the fact that JIRA isn't for user support will go here.

4) Keep and reorganize the more detailed technical information (developers, advanced users, and admins) on the Wiki. For this, I have no doubt I'll be copying large chunks of the old Wiki, but likely updating any pre-branch-2 information.

5) Once everything is moved, organized, and gets enough +1's from the community, update the pointers to the new Wiki and obsolete the old one.

Any further discussion is still welcome.

-Ray

On 7/27/16 12:08 PM, Martin Rosse wrote:
> Hi Ray,
>
> The migration is much needed, and thanks for initiating it.
>
> Regarding approaches to cleaning up the Wiki content--my 2 cents is in favor of an approach similar to the Spark cwiki:
>
> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>
> My take is that the Hadoop product docs on hadoop.apache.org generally target (or should target) the audiences you describe in 1-4, while the Wiki is (should be) primarily for audience #5 or "Hadoop staff"--internal Hadoop development, product management, QA, etc.
>
> Definitely current Wiki content such as "Overview of Hadoop" and the link to "Single Node Hadoop Cluster" installation is redundant, unnecessary doc maintenance, and annoying to come across as a user, because you have to assess its value relative to the same/similar content in the product doc on hadoop.apache.org.
>
> BTW, I did some random testing of ASF project wikis hosted on cwiki.apache.org, and the pages for those sites definitely load much, much faster than ASF wiki pages using MoinMoin. Clearly no surprise.
>
> Best,
> Martin
>
> On Wed, Jul 27, 2016 at 10:29 AM, Ray Chiang wrote:
>> Good to know. It's certainly easier to set up an alternate location in any case and then do a wholesale migration. It saves from having that "under construction" look before it's complete.
>>
>> I'll get on the appropriate infra@ list and ask about recommendations.
>>
>> -Ray
>>
>> On 7/26/16 10:49 PM, Andrew Wang wrote:
>>> Hi Ray, if you're going to do a wiki cleanup, fair warning that I filed this INFRA JIRA about the wiki being terribly slow, and they closed it as WONTFIX:
>>>
>>> https://issues.apache.org/jira/browse/INFRA-12283
>>>
>>> So if you'd actually like to undertake a wiki cleanup, we should also consider migrating the content to a wiki that isn't terribly slow. I think cwiki.apache.org is better, but maybe we should ask infra what the preferred option is here. They might be able to help with a content migration too.
>>>
>>> On Tue, Jul 26, 2016 at 3:27 PM, Ray Chiang wrote:
>>>> Coming in late to an old thread.
>>>>
>>>> I was looking around at the Hadoop documentation (hadoop.apache.org and wiki.apache.org/hadoop) and I'd sum up the current state of the documentation as follows:
>>>>
>>>> 1. hadoop.apache.org is pretty clearly full of technical information. My only minor nit here is that the wiki pointer and the Git pointer at the top are really tiny.
>>>> 2. wiki.apache.org is simultaneously targeted to at least four audiences:
>>>>    1. Industry Users (broadest sense of Big Data Industry)
>>>>    2. Industry Developers (mostly those adding a layer like Hive does to MapReduce)
>>>>    3. Hadoop Users (those who just want to set up a small cluster)
>>>>    4. Hadoop Developers (e.g. using MapReduce APIs)
>>>>    5. Hadoop Internal Developers (eventual contributors)
>>>>
>>>> I'd like to initiate some cleanup of the wiki, but before I even start, I'd like to see if anyone has constructive suggestions or other approaches that would make this transition smoother.
>>>>
>>>> 1. Some sections, like Industry Users and Industry Developers, are growing so fast I'm not sure whether it's worth maintaining them in any meaningful format. I'd be inclined to make suggestions on where to start and let Google take them forward from there.
>>>> 2. Organize the developer section based on the pieces a new reader wants to learn (new to everything, new to Hadoop, all the tools for Hadoop development, "just check out code and go", etc).
>>>> 3. Organize the Users section a bit more. The "Setting up a Hadoop Cluster" is grouped well, but I'd perhaps rearrange the ordering a bit.
>>>>
>>>> -Ray
Testing Hadoop based system with large sets of data
Hi,

How do I test the ingestion speed of my system? I want to test with particular types of data. Are there any tools which generate this type of data?

Thanks in advance,
Basavaraj
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/

[Jul 27, 2016 10:41:09 AM] (aajisaka) HADOOP-9427. Use JUnit assumptions to skip platform-specific tests.
[Jul 27, 2016 8:58:04 PM] (yzhang) HDFS-10667. Report more accurate info about data corruption location.
[Jul 27, 2016 10:50:38 PM] (cnauroth) HADOOP-13354. Update WASB driver to use the latest version (4.2.0) of
[Jul 28, 2016 12:55:41 AM] (wang) HDFS-10519. Add a configuration option to enable in-progress edit log
[Jul 28, 2016 1:21:58 AM] (subru) YARN-5441. Fixing minor Scheduler test case failures
[Jul 28, 2016 3:06:09 AM] (varunsaxena) YARN-5431. TimelineReader daemon start should allow to pass its own

-1 overall

The following subsystems voted -1: asflicense unit

The following subsystems voted -1 but were configured to be filtered/ignored: cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s): unit

Specific tests:

Failed junit tests:
  hadoop.cli.TestHDFSCLI
  hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics
  hadoop.yarn.server.nodemanager.TestDirectoryCollection
  hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
  hadoop.yarn.server.TestContainerManagerSecurity
  hadoop.yarn.client.api.impl.TestYarnClient
  hadoop.mapred.gridmix.TestLoadJob

cc:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-compile-cc-root.txt [4.0K]

javac:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-compile-javac-root.txt [172K]

checkstyle:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-checkstyle-root.txt [16M]

pylint:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-patch-pylint.txt [16K]

shellcheck:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-patch-shellcheck.txt [20K]

shelldocs:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-patch-shelldocs.txt [16K]

whitespace:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/whitespace-eol.txt [12M]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/whitespace-tabs.txt [1.3M]

javadoc:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-javadoc-javadoc-root.txt [2.3M]

unit:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [144K]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt [36K]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt [268K]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt [16K]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-nativetask.txt [124K]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-tools_hadoop-gridmix.txt [16K]

asflicense:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-asflicense-problems.txt [4.0K]

Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org