[jira] [Created] (HADOOP-13439) Fix race between TestMetricsSystemImpl and TestGangliaMetrics
Masatake Iwasaki created HADOOP-13439:
--------------------------------------
Summary: Fix race between TestMetricsSystemImpl and TestGangliaMetrics
Key: HADOOP-13439
URL: https://issues.apache.org/jira/browse/HADOOP-13439
Project: Hadoop Common
Issue Type: Bug
Components: test
Reporter: Masatake Iwasaki
Priority: Minor

TestGangliaMetrics#testGangliaMetrics2 sets *.period to 120, but 8 was used:

{noformat}
2016-06-27 15:21:31,480 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(375)) - Scheduled snapshot period at 8 second(s).
{noformat}
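The race comes from two test classes sharing the same metrics configuration. A minimal sketch of the isolation idea, assuming the metrics2 test helpers ConfigBuilder and TestMetricsConfig (the "gangliatest" prefix and file name here are hypothetical, and exact signatures may differ by branch):

{code}
import org.apache.hadoop.metrics2.MetricsSystem;
import org.apache.hadoop.metrics2.impl.ConfigBuilder;
import org.apache.hadoop.metrics2.impl.MetricsSystemImpl;
import org.apache.hadoop.metrics2.impl.TestMetricsConfig;

// Hypothetical sketch: give the Ganglia test its own config prefix and
// config file, so a MetricsSystem left running by another test (here, one
// started with period 8) cannot feed it stale settings.
public class GangliaConfigIsolation {
  static MetricsSystem startIsolated() {
    new ConfigBuilder()
        .add("gangliatest.*.period", 120)  // unique "gangliatest" prefix
        .save(TestMetricsConfig.getTestFilename("hadoop-metrics2-gangliatest"));
    MetricsSystem ms = new MetricsSystemImpl("gangliatest");
    return ms.init("gangliatest");  // picks up only gangliatest.* keys
  }
}
{code}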
Re: Yes/No newbie question on contributing
I don't have permissions to edit the Wiki, but I've included a link below to my proposed revisions to the How To Contribute page. As a reminder, these changes are meant to make it clear that one does not need to run/pass *all* project unit tests before starting to write code or submit a patch. Knowing this as a newbie would have saved me a lot of time.

I'm not sure whether my edits cover the suggestion of instructing folks to run the same checks as are done in the automated precommit builds...I don't know what those checks are. And we had not concluded whether to instruct folks as such or not...thoughts?

https://docs.google.com/document/d/1wvGFQ9SgELwCnPanmZ4FmN_uNLW5iyMkVIr3A4CNrTk/edit

Best,
Martin

On Tue, Jul 26, 2016 at 2:58 PM, Martin Rosse wrote:
> Thanks everyone...that helped. I'll go ahead and edit the Wiki to clarify the expectation.
>
> I got a successful build using:
>
> ~/code/hadoop$ mvn install -DskipTests
>
> To respond to Vinod's questions:
>
> I think the answer is trunk. I obtained the source code using:
>
> git clone git://git.apache.org/hadoop.git
>
> ...and the pom.xml in my source says version 3.0.0-alpha1-SNAPSHOT, and I haven't tried to do anything with branches yet.
>
> You were right--without knowing any better I was running all the unit tests, so I came across several errors...one error that I was able to fix was apparently due to a newline in the etc/hosts file, as I learned from https://issues.apache.org/jira/browse/HADOOP-10888. After my fix, a subsequent build passed that unit test. But then a subsequent build to that build caused that same error again, even though the newline was fixed.
>
> Another error I got when running mvn install without -DskipTests is described in https://issues.apache.org/jira/browse/HADOOP-12611. This is the type of error I thought would be worthy of ignoring.
>
> Thanks again for your time--much appreciated!
>
> -Martin
>
> On Tue, Jul 26, 2016 at 1:27 PM, Sean Busbey wrote:
>> The current HowToContribute guide expressly tells folks that they should ensure all the tests run and pass before and after their change.
>>
>> Sounds like we're due for an update if the expectation is now that folks should be using -DskipTests and runs on particular modules. Maybe we could instruct folks on running the same checks we'll do in the automated precommit builds?
>>
>> On Tue, Jul 26, 2016 at 1:47 PM, Vinod Kumar Vavilapalli wrote:
>>> The short answer is that it is expected to pass without any errors.
>>>
>>> On branch-2.x, that command passes cleanly without any errors though it takes north of 10 minutes. Note that I run it with -DskipTests - you don't want to wait for all the unit tests to run, that'll take too much time. I expect trunk to be the same too.
>>>
>>> Which branch are you running this against? What errors are you seeing? If it is unit-tests you are talking about, you can instead run with skipTests, run only specific tests or all tests in the module you are touching, make sure they pass and then let Jenkins infrastructure run the remaining tests when you submit the patch.
>>>
>>> +Vinod
>>>
>>>> On Jul 26, 2016, at 11:41 AM, Martin Rosse wrote:
>>>>
>>>> Hi,
>>>>
>>>> In the How To Contribute doc, it says:
>>>>
>>>> "Try getting the project to build and test locally before writing code"
>>>>
>>>> So, just to be 100% certain before I keep troubleshooting things, this means I should be able to run
>>>>
>>>> mvn clean install -Pdist -Dtar
>>>>
>>>> without getting any failures or errors at all...none...zero, right?
>>>>
>>>> I am surprised at how long this is taking as errors keep cropping up. Should I just expect it to really take many hours (already at 10+) to work through these issues? I am setting up a dev environment on an Ubuntu 14.04 64-bit desktop from the AWS marketplace running on EC2.
>>>>
>>>> It would seem it's an obvious YES answer, but given the time investment I've been making I just wanted to be absolutely sure before continuing.
>>>>
>>>> I thought it possible that maybe some errors, depending on their nature, can be overlooked, and that other developers may be doing that in practice. And hence perhaps I should as well to save time. Yes or No??
>>>>
>>>> Thank you,
>>>>
>>>> Martin
>>
>> --
>> busbey
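Concretely, the workflow Vinod describes looks something like the following; the module path and test class name are placeholders for illustration, not prescriptions from the thread:

    # Full build without running unit tests (takes roughly 10+ minutes):
    ~/code/hadoop$ mvn install -DskipTests

    # Then run only the tests for the module you are touching:
    ~/code/hadoop$ cd hadoop-common-project/hadoop-common
    hadoop-common$ mvn test -Dtest=TestConfiguration   # a single test class
    hadoop-common$ mvn test                            # all tests in this module

Jenkins then runs the remaining tests across the whole tree when the patch is submitted.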
Re: [DISCUSS] Release numbering semantics with concurrent (>2) releases [Was Setting JIRA fix versions for 3.0.0 releases]
I've written up the proposal from my initial reply in a GDoc. I found one bug in the rules when working through my example again, and also incorporated Akira's correction. Thanks all for the discussion so far!

https://docs.google.com/document/d/1vlDtpsnSjBPIZiWQjSwgnV0_Z6ZQJ1r91J8G0FduyTg/edit

Ping me if you'd like edit/comment privs, or send comments to this thread. I'm eager to close on this so we can keep pushing on the 2.8.0 and 3.0.0-alpha1 releases. I'd like to post this content somewhere official early next week, so if you have additional feedback, please keep it coming.

Best,
Andrew

On Thu, Jul 28, 2016 at 3:01 PM, Karthik Kambatla wrote:
> Inline.
>
>>> BTW, I have never seen a clear definition for an alpha release. It has previously been used to mean unstable APIs (2.1-alpha, 2.2-alpha, etc.) but sometimes means unstable production quality (2.7.0). I think we should clearly define it with major consensus so users won't misunderstand the risk here.
>>
>> These are the definitions of "alpha" and "beta" used leading up to the 2.2 GA release, so it's not something new. These are also the normal industry definitions. Alpha means no API compatibility guarantees, early software. Beta means API compatible, but still some bugs.
>>
>> If anything, we never defined the terms "alpha" and "beta" for 2.x releases post-2.2 GA. The thinking was that everything after would be compatible and thus (at the least) never alpha. I think this is why the website talks about the 2.7.x line as "stable" or "unstable" instead, but since I think we still guarantee API compatibility between 2.7.0 and 2.7.1, we could have just called 2.7.0 "beta".
>>
>> I think this would be good to have in our compat guidelines or somewhere. Happy to work with Karthik/Vinod/others on this.
>
> I am not sure if we formally defined the terms "alpha" and "beta" for Hadoop 2, but my understanding of them agrees with the general definitions on the web.
>
> Alpha:
> - Early version for testing - integration with downstream, deployment, etc.
> - Not feature complete
> - No compatibility guarantees yet
>
> Beta:
> - Feature complete
> - API compatibility guaranteed
> - Need clear definition for other kinds of compatibility (wire, client-dependencies, server-dependencies, etc.)
> - Not ready for production deployments
>
> GA:
> - Ready for production
> - All the usual compatibility guarantees apply.
>
> If there is general agreement, I can work towards getting this into our documentation.
>
>>> Also, if we treat our 3.0.0-alpha release work seriously, we should also think about trunk's version number issue (bump up to 4.0.0-alpha?) or there could be no room for 3.0 incompatible feature/bits soon.
>>
>> While we're still in alpha for 3.0.0, there's no need for a separate 4.0.0 version since there's no guarantee of API compatibility. I plan to cut a branch-3 for the beta period, at which point we'll upgrade trunk to 4.0.0-alpha1. This is something we discussed on another mailing list thread.
>
> Branching at beta time seems reasonable.
>
> Overall, are there any incompatible changes on trunk that we wouldn't be comfortable shipping in 3.0.0? If yes, do we feel comfortable shipping those bits ever?
>
>> Best,
>> Andrew
Updated 2.8.0-SNAPSHOT artifact
The latest snapshot was uploaded in Nov 2015, but checkins are still coming in quite frequently:

https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-yarn-api/

Are there any plans to start producing updated SNAPSHOT artifacts for the current Hadoop development lines?
Re: Heads up: branched branch-3.0.0-alpha1
Looking at the state of branch-3.0.0-alpha1 and the fix versions, we're already out of sync. I think the easiest solution is to (close to the release date) rebranch and change any 3.0.0-alpha2 fix versions to 3.0.0-alpha1. I think the versioning discussions are converging, so hopefully soon. I'll send another email when this happens.

On Fri, Jul 15, 2016 at 7:26 PM, Andrew Wang wrote:
> Hi all,
>
> You might have already noticed from the bulk JIRA updates, but I've branched branch-3.0.0-alpha1 off trunk, and updated trunk to be 3.0.0-alpha2. For most changes, you can just commit to trunk and the branch-2s. Just remember to use the new 3.0.0-alpha2 version where appropriate.
>
> I still need to update the markdown release notes and double check things, but hopefully an RC0 will be coming down the pipe soon.
>
> Thanks,
> Andrew
Re: [DISCUSS] Release numbering semantics with concurrent (>2) releases [Was Setting JIRA fix versions for 3.0.0 releases]
Inline.

>> BTW, I have never seen a clear definition for an alpha release. It has previously been used to mean unstable APIs (2.1-alpha, 2.2-alpha, etc.) but sometimes means unstable production quality (2.7.0). I think we should clearly define it with major consensus so users won't misunderstand the risk here.
>
> These are the definitions of "alpha" and "beta" used leading up to the 2.2 GA release, so it's not something new. These are also the normal industry definitions. Alpha means no API compatibility guarantees, early software. Beta means API compatible, but still some bugs.
>
> If anything, we never defined the terms "alpha" and "beta" for 2.x releases post-2.2 GA. The thinking was that everything after would be compatible and thus (at the least) never alpha. I think this is why the website talks about the 2.7.x line as "stable" or "unstable" instead, but since I think we still guarantee API compatibility between 2.7.0 and 2.7.1, we could have just called 2.7.0 "beta".
>
> I think this would be good to have in our compat guidelines or somewhere. Happy to work with Karthik/Vinod/others on this.

I am not sure if we formally defined the terms "alpha" and "beta" for Hadoop 2, but my understanding of them agrees with the general definitions on the web.

Alpha:
- Early version for testing - integration with downstream, deployment, etc.
- Not feature complete
- No compatibility guarantees yet

Beta:
- Feature complete
- API compatibility guaranteed
- Need clear definition for other kinds of compatibility (wire, client-dependencies, server-dependencies, etc.)
- Not ready for production deployments

GA:
- Ready for production
- All the usual compatibility guarantees apply.

If there is general agreement, I can work towards getting this into our documentation.

>> Also, if we treat our 3.0.0-alpha release work seriously, we should also think about trunk's version number issue (bump up to 4.0.0-alpha?) or there could be no room for 3.0 incompatible feature/bits soon.
>
> While we're still in alpha for 3.0.0, there's no need for a separate 4.0.0 version since there's no guarantee of API compatibility. I plan to cut a branch-3 for the beta period, at which point we'll upgrade trunk to 4.0.0-alpha1. This is something we discussed on another mailing list thread.

Branching at beta time seems reasonable.

Overall, are there any incompatible changes on trunk that we wouldn't be comfortable shipping in 3.0.0? If yes, do we feel comfortable shipping those bits ever?

> Best,
> Andrew
[jira] [Created] (HADOOP-13438) Optimize IPC server protobuf decoding
Daryn Sharp created HADOOP-13438:
---------------------------------
Summary: Optimize IPC server protobuf decoding
Key: HADOOP-13438
URL: https://issues.apache.org/jira/browse/HADOOP-13438
Project: Hadoop Common
Issue Type: Sub-task
Reporter: Daryn Sharp
Assignee: Daryn Sharp

The current use of the protobuf API uses an expensive code path. The builder uses the parser to instantiate a message, then copies the message into the builder. The parser creates multi-layered, internally buffering streams that cause excessive byte[] allocations.

Using the parser directly, with a coded input stream backed by the byte[] from the wire, takes a fast path straight to the pb message's ctor. Substantially less garbage is generated.
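A rough sketch of the two paths, using one of the RPC header messages as an example (whether this particular message is on the hot path is an assumption; the calls themselves are the standard protobuf generated-message API):

{code}
import com.google.protobuf.CodedInputStream;
import org.apache.hadoop.ipc.protobuf.RpcHeaderProtos.RpcRequestHeaderProto;

public class PbDecodePaths {
  // Slow path: the builder parses into a temporary message and then copies
  // that message's fields into itself, with extra buffering and byte[] churn.
  static RpcRequestHeaderProto viaBuilder(byte[] wire) throws Exception {
    return RpcRequestHeaderProto.newBuilder().mergeFrom(wire).build();
  }

  // Fast path: a CodedInputStream backed directly by the wire byte[] feeds
  // the parser, which goes straight to the message constructor.
  static RpcRequestHeaderProto viaParser(byte[] wire) throws Exception {
    CodedInputStream in = CodedInputStream.newInstance(wire);
    return RpcRequestHeaderProto.parseFrom(in);
  }
}
{code}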
Apache Hadoop qbt Report: trunk+JDK8 on Linux/ppc64le
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/31/

[Jul 21, 2016 3:38:20 AM] (lei) HADOOP-12928. Update netty to 3.10.5.Final to sync with zookeeper. (lei)
[Jul 21, 2016 6:50:47 AM] (rohithsharmaks) YARN-1126. Add validation of users input nodes-states options to nodes
[Jul 21, 2016 7:17:27 AM] (rohithsharmaks) YARN-5092. TestRMDelegationTokens fails intermittently. Contributed by
[Jul 21, 2016 6:14:39 PM] (jing9) HDFS-10653. Optimize conversion from path string to components.
[Jul 21, 2016 6:26:08 PM] (aajisaka) HDFS-10287. MiniDFSCluster should implement AutoCloseable. Contributed
[Jul 21, 2016 6:34:48 PM] (aajisaka) MAPREDUCE-6738. TestJobListCache.testAddExisting failed intermittently
[Jul 21, 2016 9:12:31 PM] (cnauroth) HADOOP-13240. TestAclCommands.testSetfaclValidations fail. Contributed
[Jul 21, 2016 9:43:57 PM] (mfoley) HADOOP-13382. Remove unneeded commons-httpclient dependencies from POM
[Jul 21, 2016 11:41:02 PM] (xiao) HDFS-10225. DataNode hot swap drives should disallow storage type
[Jul 22, 2016 6:21:47 AM] (cdouglas) HADOOP-13393. Omit unsupported fs.defaultFS setting in ADLS
[Jul 22, 2016 4:16:38 PM] (cnauroth) HADOOP-13392. [Azure Data Lake] OAuth2 configuration should be default
[Jul 22, 2016 6:08:20 PM] (uma.gangumalla) HDFS-10565: Erasure Coding: Document about the current allowed storage
[Jul 22, 2016 7:33:50 PM] (arp) HDFS-10660. Expose storage policy apis via HDFSAdmin interface.
[Jul 22, 2016 10:38:18 PM] (cnauroth) HADOOP-13207. Specify FileSystem listStatus, listFiles and
[Jul 23, 2016 1:08:12 AM] (cdouglas) HADOOP-13272. ViewFileSystem should support storage policy related API.
[Jul 23, 2016 9:45:33 AM] (kai.zheng) HDFS-10651. Clean up some configuration related codes about legacy block
[Jul 23, 2016 5:00:08 PM] (stevel) HADOOP-13389 TestS3ATemporaryCredentials.testSTS error when using IAM
[Jul 25, 2016 1:45:03 PM] (stevel) HADOOP-13406 S3AFileSystem: Consider reusing filestatus in delete() and
[Jul 25, 2016 2:50:23 PM] (stevel) HADOOP-13188 S3A file-create should throw error rather than overwrite
[Jul 25, 2016 9:54:48 PM] (jlowe) MAPREDUCE-6744. Increase timeout on TestDFSIO tests. Contributed by Eric
[Jul 25, 2016 11:37:50 PM] (cdouglas) YARN-5164. Use plan RLE to improve CapacityOverTimePolicy efficiency
[Jul 26, 2016 1:41:13 AM] (jing9) HDFS-10688. BPServiceActor may run into a tight loop for sending block
[Jul 26, 2016 1:48:21 AM] (iwasakims) HDFS-10671. Fix typo in HdfsRollingUpgrade.md. Contributed by Yiqun Lin.
[Jul 26, 2016 1:50:59 AM] (shv) HDFS-10301. Interleaving processing of storages from repeated block
[Jul 26, 2016 5:24:24 AM] (brahma) HDFS-10668. Fix intermittently failing UT
[Jul 26, 2016 1:30:02 PM] (stevel) Revert "HDFS-10668. Fix intermittently failing UT
[Jul 26, 2016 1:53:37 PM] (kai.zheng) HADOOP-13041. Adding tests for coder utilities. Contributed by Kai
[Jul 26, 2016 3:01:42 PM] (weichiu) HDFS-9937. Update dfsadmin command line help and HdfsQuotaAdminGuide.
[Jul 26, 2016 3:19:06 PM] (varunsaxena) YARN-5431. TimelineReader daemon start should allow to pass its own
[Jul 26, 2016 3:43:12 PM] (varunsaxena) Revert "YARN-5431. TimelineReader daemon start should allow to pass its
[Jul 26, 2016 7:27:46 PM] (arp) HDFS-10642.
[Jul 26, 2016 9:54:03 PM] (Arun Suresh) YARN-5392. Replace use of Priority in the Scheduling infrastructure with
[Jul 26, 2016 10:33:20 PM] (cnauroth) HADOOP-13422. ZKDelegationTokenSecretManager JaasConfig does not work
[Jul 26, 2016 11:01:50 PM] (weichiu) HDFS-10598. DiskBalancer does not execute multi-steps plan. Contributed
[Jul 27, 2016 1:14:09 AM] (wangda) YARN-5342. Improve non-exclusive node partition resource allocation in
[Jul 27, 2016 2:08:30 AM] (Arun Suresh) YARN-5351. ResourceRequest should take ExecutionType into account during
[Jul 27, 2016 4:22:59 AM] (wangda) YARN-5195. RM intermittently crashed with NPE while handling
[Jul 27, 2016 4:56:42 AM] (brahma) HDFS-10668. Fix intermittently failing UT
[Jul 27, 2016 10:41:09 AM] (aajisaka) HADOOP-9427. Use JUnit assumptions to skip platform-specific tests.
[Jul 27, 2016 8:58:04 PM] (yzhang) HDFS-10667. Report more accurate info about data corruption location.
[Jul 27, 2016 10:50:38 PM] (cnauroth) HADOOP-13354. Update WASB driver to use the latest version (4.2.0) of
[Jul 28, 2016 12:55:41 AM] (wang) HDFS-10519. Add a configuration option to enable in-progress edit log
[Jul 28, 2016 1:21:58 AM] (subru) YARN-5441. Fixing minor Scheduler test case failures
[Jul 28, 2016 3:06:09 AM] (varunsaxena) YARN-5431. TimelineReader daemon start should allow to pass its own
[Jul 28, 2016 7:58:23 AM] (aajisaka) HDFS-10696. TestHDFSCLI fails. Contributed by Kai Sasaki.
[Jul 28, 2016 1:35:24 PM] (junping_du) YARN-5432. Lock already held by another process while LevelDB cache
[Jul 28, 2016 5:23:18 PM] (gtcarrera9) YARN-5440. Use AHSClient in YarnClient when TimelineServer is running.
[jira] [Created] (HADOOP-13437) KMS should reload whitelist and default key ACLs when hot-reloading
Xiao Chen created HADOOP-13437:
-------------------------------
Summary: KMS should reload whitelist and default key ACLs when hot-reloading
Key: HADOOP-13437
URL: https://issues.apache.org/jira/browse/HADOOP-13437
Project: Hadoop Common
Issue Type: Bug
Components: kms
Affects Versions: 2.6.0
Reporter: Xiao Chen
Assignee: Xiao Chen

When hot-reloading, {{KMSACLs#setKeyACLs}} ignores whitelist and default key entries if they're present in memory. We should reload them; hot-reload and cold-start should not have any difference in behavior.
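A sketch of the intended reload semantics; the class and field names below are illustrative, not the actual KMSACLs internals, though the default.key.acl./whitelist.key.acl. prefixes follow the KMS configuration format:

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: rebuild every ACL map purely from the freshly read
// configuration and swap the references, so a hot-reload can never keep
// stale default/whitelist entries that a cold start would not have.
public class AclReloadSketch {
  private volatile Map<String, String> defaultKeyAcls = new HashMap<>();
  private volatile Map<String, String> whitelistKeyAcls = new HashMap<>();

  void setKeyACLs(Map<String, String> conf) {
    Map<String, String> newDefaults = new HashMap<>();
    Map<String, String> newWhitelist = new HashMap<>();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      String k = e.getKey();
      if (k.startsWith("default.key.acl.")) {
        newDefaults.put(k.substring("default.key.acl.".length()), e.getValue());
      } else if (k.startsWith("whitelist.key.acl.")) {
        newWhitelist.put(k.substring("whitelist.key.acl.".length()), e.getValue());
      }
    }
    // Unconditional swap -- never "keep the old value if one is in memory".
    defaultKeyAcls = newDefaults;
    whitelistKeyAcls = newWhitelist;
  }
}
{code}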
Re: Wiki migration and clean-up
Good point Allen. I expect that moving everything to the new wiki will take a while. Once that's done, the code can be changed. Just doing a quick grep, I see a total of 12 places to change to point to the new Wiki (there may be more). For existing installs, we can either keep a small subset of the old pages or add redirects/pointers from the old wiki to the new location.

-Ray

On 7/28/16 11:07 AM, Allen Wittenauer wrote:
> I hope you folks are aware that this is much more intensive than just moving a bunch of documents. Lots of wiki pages are referenced in the source code, including in user-facing error messages.
>
> On Jul 28, 2016, at 10:47 AM, Ray Chiang wrote:
>> Thanks Martin. I did ask on INFRA-12342, and it looks like Confluence Wiki is the recommended "latest and greatest".
>>
>> Here's my proposal as it currently stands:
>>
>> 1) Move to Confluence Wiki.
>>
>> 2) Move all the Industry/meetup content to a single page with a small set of external links. This will be mostly of the form, "if you want to know more you can get started with...".
>>
>> 3) Have one other page for users just getting started. The updated IRC information, mailing lists, and the fact that JIRA isn't for user support will go here.
>>
>> 4) Keep and reorganize the more detailed technical information (developers, advanced users, and admins) on the Wiki. For this, I have no doubt I'll be copying large chunks of the old Wiki, but likely updating any pre-branch-2 information.
>>
>> 5) Once everything is moved, organized, and gets enough +1's from the community, update the pointers to the new Wiki and obsolete the old one.
>>
>> Any further discussion is still welcome.
>>
>> -Ray
>>
>> On 7/27/16 12:08 PM, Martin Rosse wrote:
>>> Hi Ray,
>>>
>>> The migration is much needed, and thanks for initiating it.
>>>
>>> Regarding approaches to cleaning up the Wiki content--my 2 cents is in favor of an approach similar to the Spark cwiki:
>>>
>>> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>>>
>>> My take is that the Hadoop product docs on hadoop.apache.org generally target (or should target) the audiences you describe in 1-4, while the Wiki is (should be) primarily for audience #5 or "Hadoop staff"--internal Hadoop development, product management, QA, etc.
>>>
>>> Definitely current Wiki content such as "Overview of Hadoop" and the link to "Single Node Hadoop Cluster" installation is redundant, unnecessary doc maintenance, and annoying to come across as a user, because you have to assess its value relative to the same/similar content in the product doc on hadoop.apache.org.
>>>
>>> BTW, I did some random testing of ASF project wikis hosted on cwiki.apache.org, and the pages for those sites definitely load much, much faster than ASF wiki pages using MoinMoin. Clearly no surprise.
>>>
>>> Best,
>>> Martin
>>>
>>> On Wed, Jul 27, 2016 at 10:29 AM, Ray Chiang wrote:
>>>> Good to know. It's certainly easier to set up an alternate location in any case and then do a wholesale migration. It saves from having that "under construction" look before it's complete.
>>>>
>>>> I'll get on the appropriate infra@ list and ask about recommendations.
>>>>
>>>> -Ray
>>>>
>>>> On 7/26/16 10:49 PM, Andrew Wang wrote:
>>>>> Hi Ray, if you're going to do a wiki cleanup, fair warning that I filed this INFRA JIRA about the wiki being terribly slow, and they closed it as WONTFIX:
>>>>>
>>>>> https://issues.apache.org/jira/browse/INFRA-12283
>>>>>
>>>>> So if you'd actually like to undertake a wiki cleanup, we should also consider migrating the content to a wiki that isn't terribly slow. I think cwiki.apache.org is better, but maybe we should ask infra what the preferred option is here. They might be able to help with a content migration too.
>>>>>
>>>>> On Tue, Jul 26, 2016 at 3:27 PM, Ray Chiang wrote:
>>>>>> Coming in late to an old thread.
>>>>>>
>>>>>> I was looking around at the Hadoop documentation (hadoop.apache.org and wiki.apache.org/hadoop) and I'd sum up the current state of the documentation as follows:
>>>>>>
>>>>>> 1. hadoop.apache.org is pretty clearly full of technical information. My only minor nit here is that the wiki pointer and the Git pointer at the top are really tiny.
>>>>>> 2. wiki.apache.org is simultaneously targeted to at least four audiences:
>>>>>>    1. Industry Users (broadest sense of Big Data Industry)
>>>>>>    2. Industry Developers (mostly those adding a layer like Hive does to MapReduce)
>>>>>>    3. Hadoop Users (those who just want to set up a small cluster)
>>>>>>    4. Hadoop Developers (e.g. using MapReduce APIs)
>>>>>>    5. Hadoop Internal Developers (eventual contributors)
>>>>>>
>>>>>> I'd like to initiate some cleanup of the wiki, but before I even start, I'd like to see if anyone has constructive suggestions or other approaches that would make this transition smoother.
>>>>>>
>>>>>> 1. Some sections, like Industry Users and Industry Developers, are growing so fast I'm not sure whether it's worth maintaining them in any meaningful format. I'd be inclined to make suggestions on where to start and let Google take them forward from there.
>>>>>> 2. Organize the developer section based on the pieces a new reader wants to learn (new to everything, new to Hadoop, all the tools for Hadoop development, "just check out code and go", etc).
>>>>>> 3. Organize the Users section a bit more. The "Setting up a Hadoop Cluster" is grouped well, but I'd perhaps rearrange the ordering a bit.
Re: Wiki migration and clean-up
I hope you folks are aware that this is much more intensive than just moving a bunch of documents. Lots of wiki pages are referenced in the source code, including in user-facing error messages.

> On Jul 28, 2016, at 10:47 AM, Ray Chiang wrote:
>
> Thanks Martin. I did ask on INFRA-12342, and it looks like Confluence Wiki is the recommended "latest and greatest".
>
> Here's my proposal as it currently stands:
>
> 1) Move to Confluence Wiki.
>
> 2) Move all the Industry/meetup content to a single page with a small set of external links. This will be mostly of the form, "if you want to know more you can get started with...".
>
> 3) Have one other page for users just getting started. The updated IRC information, mailing lists, and the fact that JIRA isn't for user support will go here.
>
> 4) Keep and reorganize the more detailed technical information (developers, advanced users, and admins) on the Wiki. For this, I have no doubt I'll be copying large chunks of the old Wiki, but likely updating any pre-branch-2 information.
>
> 5) Once everything is moved, organized, and gets enough +1's from the community, update the pointers to the new Wiki and obsolete the old one.
>
> Any further discussion is still welcome.
>
> -Ray
>
> On 7/27/16 12:08 PM, Martin Rosse wrote:
>> Hi Ray,
>>
>> The migration is much needed, and thanks for initiating it.
>>
>> Regarding approaches to cleaning up the Wiki content--my 2 cents is in favor of an approach similar to the Spark cwiki:
>>
>> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>>
>> My take is that the Hadoop product docs on hadoop.apache.org generally target (or should target) the audiences you describe in 1-4, while the Wiki is (should be) primarily for audience #5 or "Hadoop staff"--internal Hadoop development, product management, QA, etc.
>>
>> Definitely current Wiki content such as "Overview of Hadoop" and the link to "Single Node Hadoop Cluster" installation is redundant, unnecessary doc maintenance, and annoying to come across as a user, because you have to assess its value relative to the same/similar content in the product doc on hadoop.apache.org.
>>
>> BTW, I did some random testing of ASF project wikis hosted on cwiki.apache.org, and the pages for those sites definitely load much, much faster than ASF wiki pages using MoinMoin. Clearly no surprise.
>>
>> Best,
>> Martin
>>
>> On Wed, Jul 27, 2016 at 10:29 AM, Ray Chiang wrote:
>>> Good to know. It's certainly easier to set up an alternate location in any case and then do a wholesale migration. It saves from having that "under construction" look before it's complete.
>>>
>>> I'll get on the appropriate infra@ list and ask about recommendations.
>>>
>>> -Ray
>>>
>>> On 7/26/16 10:49 PM, Andrew Wang wrote:
>>>> Hi Ray, if you're going to do a wiki cleanup, fair warning that I filed this INFRA JIRA about the wiki being terribly slow, and they closed it as WONTFIX:
>>>>
>>>> https://issues.apache.org/jira/browse/INFRA-12283
>>>>
>>>> So if you'd actually like to undertake a wiki cleanup, we should also consider migrating the content to a wiki that isn't terribly slow. I think cwiki.apache.org is better, but maybe we should ask infra what the preferred option is here. They might be able to help with a content migration too.
>>>>
>>>> On Tue, Jul 26, 2016 at 3:27 PM, Ray Chiang wrote:
>>>>> Coming in late to an old thread.
>>>>>
>>>>> I was looking around at the Hadoop documentation (hadoop.apache.org and wiki.apache.org/hadoop) and I'd sum up the current state of the documentation as follows:
>>>>>
>>>>> 1. hadoop.apache.org is pretty clearly full of technical information. My only minor nit here is that the wiki pointer and the Git pointer at the top are really tiny.
>>>>> 2. wiki.apache.org is simultaneously targeted to at least four audiences:
>>>>>    1. Industry Users (broadest sense of Big Data Industry)
>>>>>    2. Industry Developers (mostly those adding a layer like Hive does to MapReduce)
>>>>>    3. Hadoop Users (those who just want to set up a small cluster)
>>>>>    4. Hadoop Developers (e.g. using MapReduce APIs)
>>>>>    5. Hadoop Internal Developers (eventual contributors)
>>>>>
>>>>> I'd like to initiate some cleanup of the wiki, but before I even start, I'd like to see if anyone has constructive suggestions or other approaches that would make this transition smoother.
>>>>>
>>>>> 1. Some sections, like Industry Users and Industry Developers, are growing so fast I'm not sure whether it's worth maintaining them in any meaningful format. I'd be inclined to make suggestions on where to start and let Google take them forward from there.
>>>>> 2. Organize the developer section based on the pieces a new reader wants to learn (new to everything, new to Hadoop, all the tools for Hadoop development, "just check out code and go", etc).
Re: Wiki migration and clean-up
Big +1 from me. Better docs are incredibly helpful for our users and new contributors. This cleanup would be a great contribution.

If anyone else is looking for a side project, the website could also badly use a refresh. There aren't actually that many pages:

-> % find author -name "*.xml"
author/src/documentation/skinconf.xml
author/src/documentation/content/xdocs/index.xml
author/src/documentation/content/xdocs/releases.xml
author/src/documentation/content/xdocs/who.xml
author/src/documentation/content/xdocs/privacy_policy.xml
author/src/documentation/content/xdocs/site.xml
author/src/documentation/content/xdocs/version_control.xml
author/src/documentation/content/xdocs/tabs.xml
author/src/documentation/content/xdocs/bylaws.xml
author/src/documentation/content/xdocs/issue_tracking.xml
author/src/documentation/content/xdocs/mailing_lists.xml
author/src/documentation/skins/common/translations/CommonMessages_es.xml
author/src/documentation/skins/common/translations/CommonMessages_en_US.xml
author/src/documentation/skins/common/translations/CommonMessages_fr.xml
author/src/documentation/skins/common/translations/CommonMessages_de.xml

On Thu, Jul 28, 2016 at 10:47 AM, Ray Chiang wrote:
> Thanks Martin. I did ask on INFRA-12342, and it looks like Confluence Wiki is the recommended "latest and greatest".
>
> Here's my proposal as it currently stands:
>
> 1) Move to Confluence Wiki.
>
> 2) Move all the Industry/meetup content to a single page with a small set of external links. This will be mostly of the form, "if you want to know more you can get started with...".
>
> 3) Have one other page for users just getting started. The updated IRC information, mailing lists, and the fact that JIRA isn't for user support will go here.
>
> 4) Keep and reorganize the more detailed technical information (developers, advanced users, and admins) on the Wiki. For this, I have no doubt I'll be copying large chunks of the old Wiki, but likely updating any pre-branch-2 information.
>
> 5) Once everything is moved, organized, and gets enough +1's from the community, update the pointers to the new Wiki and obsolete the old one.
>
> Any further discussion is still welcome.
>
> -Ray
>
> On 7/27/16 12:08 PM, Martin Rosse wrote:
>> Hi Ray,
>>
>> The migration is much needed, and thanks for initiating it.
>>
>> Regarding approaches to cleaning up the Wiki content--my 2 cents is in favor of an approach similar to the Spark cwiki:
>>
>> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>>
>> My take is that the Hadoop product docs on hadoop.apache.org generally target (or should target) the audiences you describe in 1-4, while the Wiki is (should be) primarily for audience #5 or "Hadoop staff"--internal Hadoop development, product management, QA, etc.
>>
>> Definitely current Wiki content such as "Overview of Hadoop" and the link to "Single Node Hadoop Cluster" installation is redundant, unnecessary doc maintenance, and annoying to come across as a user, because you have to assess its value relative to the same/similar content in the product doc on hadoop.apache.org.
>>
>> BTW, I did some random testing of ASF project wikis hosted on cwiki.apache.org, and the pages for those sites definitely load much, much faster than ASF wiki pages using MoinMoin. Clearly no surprise.
>>
>> Best,
>> Martin
>>
>> On Wed, Jul 27, 2016 at 10:29 AM, Ray Chiang wrote:
>>> Good to know. It's certainly easier to set up an alternate location in any case and then do a wholesale migration. It saves from having that "under construction" look before it's complete.
>>>
>>> I'll get on the appropriate infra@ list and ask about recommendations.
>>>
>>> -Ray
>>>
>>> On 7/26/16 10:49 PM, Andrew Wang wrote:
>>>> Hi Ray, if you're going to do a wiki cleanup, fair warning that I filed this INFRA JIRA about the wiki being terribly slow, and they closed it as WONTFIX:
>>>>
>>>> https://issues.apache.org/jira/browse/INFRA-12283
>>>>
>>>> So if you'd actually like to undertake a wiki cleanup, we should also consider migrating the content to a wiki that isn't terribly slow. I think cwiki.apache.org is better, but maybe we should ask infra what the preferred option is here. They might be able to help with a content migration too.
>>>>
>>>> On Tue, Jul 26, 2016 at 3:27 PM, Ray Chiang wrote:
>>>>> Coming in late to an old thread.
>>>>>
>>>>> I was looking around at the Hadoop documentation (hadoop.apache.org and wiki.apache.org/hadoop) and I'd sum up the current state of the documentation as follows:
>>>>>
>>>>> 1. hadoop.apache.org is pretty clearly full of technical information. My only minor nit here is that the wiki pointer and the Git pointer at the top are really tiny.
>>>>> 2. wiki.apache.org is simultaneously targeted to at least four audiences
[jira] [Created] (HADOOP-13436) RPC connections are leaking due to missing equals override in RetryUtils#getDefaultRetryPolicy
Xiaobing Zhou created HADOOP-13436:
-----------------------------------
Summary: RPC connections are leaking due to missing equals override in RetryUtils#getDefaultRetryPolicy
Key: HADOOP-13436
URL: https://issues.apache.org/jira/browse/HADOOP-13436
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
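There is no description yet, but the title points at a familiar pattern: the IPC client caches connections under a key that includes the retry policy, so a policy without value-based equals()/hashCode() makes every key distinct and every identically configured caller open a fresh connection. A minimal sketch of the fix idea (class and field names hypothetical, not the actual RetryUtils code):

{code}
import java.util.Objects;

// Hypothetical sketch: make retry policies compare by configuration rather
// than identity, so two policies built from the same settings hash to the
// same connection-cache key and the underlying connection is reused.
public final class DefaultRetryPolicySketch {
  private final boolean retryEnabled;
  private final String retryPolicySpec; // e.g. "10000,6,60000,10"

  public DefaultRetryPolicySketch(boolean retryEnabled, String retryPolicySpec) {
    this.retryEnabled = retryEnabled;
    this.retryPolicySpec = retryPolicySpec;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof DefaultRetryPolicySketch)) {
      return false;
    }
    DefaultRetryPolicySketch that = (DefaultRetryPolicySketch) o;
    return retryEnabled == that.retryEnabled
        && Objects.equals(retryPolicySpec, that.retryPolicySpec);
  }

  @Override
  public int hashCode() {
    return Objects.hash(retryEnabled, retryPolicySpec);
  }
}
{code}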
[jira] [Created] (HADOOP-13435) Add thread local mechanism for aggregating file system storage stats
Mingliang Liu created HADOOP-13435:
-----------------------------------
Summary: Add thread local mechanism for aggregating file system storage stats
Key: HADOOP-13435
URL: https://issues.apache.org/jira/browse/HADOOP-13435
Project: Hadoop Common
Issue Type: Sub-task
Components: fs
Reporter: Mingliang Liu
Assignee: Mingliang Liu

As discussed in [HADOOP-13032], this is to add a thread-local mechanism for aggregating file system storage stats. This class will also be used in [HADOOP-13031], which is to separate the distance-oriented rack-aware read-bytes logic from {{FileSystemStorageStatistics}} into a new DFSRackAwareStorageStatistics, as it's DFS-specific. After this patch, {{FileSystemStorageStatistics}} can live without the to-be-removed {{FileSystem$Statistics}} implementation. A unit test should also be added.
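A sketch of the general thread-local aggregation pattern being proposed (names hypothetical; Hadoop's existing FileSystem$Statistics uses a similar scheme): writers bump a plain per-thread counter with no locking, and readers sum over a registry of every thread's counter:

{code}
import java.util.HashSet;
import java.util.Set;

public class ThreadLocalStatsSketch {
  /** One counter object per thread; only its owning thread writes it. */
  static final class PerThread {
    volatile long bytesRead; // volatile for reader visibility; single writer
  }

  private final Set<PerThread> allThreads = new HashSet<>();

  private final ThreadLocal<PerThread> local = ThreadLocal.withInitial(() -> {
    PerThread d = new PerThread();
    synchronized (allThreads) {
      allThreads.add(d); // register each thread's counter exactly once
    }
    return d;
  });

  /** Uncontended fast path, called on every read operation. */
  void incrementBytesRead(long n) {
    local.get().bytesRead += n;
  }

  /** Slow path for stats readers: aggregate across all threads. */
  long getBytesRead() {
    long sum = 0;
    synchronized (allThreads) {
      for (PerThread d : allThreads) {
        sum += d.bytesRead;
      }
    }
    return sum;
  }
}
{code}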
Re: Wiki migration and clean-up
Thanks Martin. I did ask on INFRA-12342, and it looks like Confluence Wiki is the recommended "latest and greatest".

Here's my proposal as it currently stands:

1) Move to Confluence Wiki.

2) Move all the Industry/meetup content to a single page with a small set of external links. This will be mostly of the form, "if you want to know more you can get started with...".

3) Have one other page for users just getting started. The updated IRC information, mailing lists, and the fact that JIRA isn't for user support will go here.

4) Keep and reorganize the more detailed technical information (developers, advanced users, and admins) on the Wiki. For this, I have no doubt I'll be copying large chunks of the old Wiki, but likely updating any pre-branch-2 information.

5) Once everything is moved, organized, and gets enough +1's from the community, update the pointers to the new Wiki and obsolete the old one.

Any further discussion is still welcome.

-Ray

On 7/27/16 12:08 PM, Martin Rosse wrote:
> Hi Ray,
>
> The migration is much needed, and thanks for initiating it.
>
> Regarding approaches to cleaning up the Wiki content--my 2 cents is in favor of an approach similar to the Spark cwiki:
>
> https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage
>
> My take is that the Hadoop product docs on hadoop.apache.org generally target (or should target) the audiences you describe in 1-4, while the Wiki is (should be) primarily for audience #5 or "Hadoop staff"--internal Hadoop development, product management, QA, etc.
>
> Definitely current Wiki content such as "Overview of Hadoop" and the link to "Single Node Hadoop Cluster" installation is redundant, unnecessary doc maintenance, and annoying to come across as a user, because you have to assess its value relative to the same/similar content in the product doc on hadoop.apache.org.
>
> BTW, I did some random testing of ASF project wikis hosted on cwiki.apache.org, and the pages for those sites definitely load much, much faster than ASF wiki pages using MoinMoin. Clearly no surprise.
>
> Best,
> Martin
>
> On Wed, Jul 27, 2016 at 10:29 AM, Ray Chiang wrote:
>> Good to know. It's certainly easier to set up an alternate location in any case and then do a wholesale migration. It saves from having that "under construction" look before it's complete.
>>
>> I'll get on the appropriate infra@ list and ask about recommendations.
>>
>> -Ray
>>
>> On 7/26/16 10:49 PM, Andrew Wang wrote:
>>> Hi Ray, if you're going to do a wiki cleanup, fair warning that I filed this INFRA JIRA about the wiki being terribly slow, and they closed it as WONTFIX:
>>>
>>> https://issues.apache.org/jira/browse/INFRA-12283
>>>
>>> So if you'd actually like to undertake a wiki cleanup, we should also consider migrating the content to a wiki that isn't terribly slow. I think cwiki.apache.org is better, but maybe we should ask infra what the preferred option is here. They might be able to help with a content migration too.
>>>
>>> On Tue, Jul 26, 2016 at 3:27 PM, Ray Chiang wrote:
>>>> Coming in late to an old thread.
>>>>
>>>> I was looking around at the Hadoop documentation (hadoop.apache.org and wiki.apache.org/hadoop) and I'd sum up the current state of the documentation as follows:
>>>>
>>>> 1. hadoop.apache.org is pretty clearly full of technical information. My only minor nit here is that the wiki pointer and the Git pointer at the top are really tiny.
>>>> 2. wiki.apache.org is simultaneously targeted to at least four audiences:
>>>>    1. Industry Users (broadest sense of Big Data Industry)
>>>>    2. Industry Developers (mostly those adding a layer like Hive does to MapReduce)
>>>>    3. Hadoop Users (those who just want to set up a small cluster)
>>>>    4. Hadoop Developers (e.g. using MapReduce APIs)
>>>>    5. Hadoop Internal Developers (eventual contributors)
>>>>
>>>> I'd like to initiate some cleanup of the wiki, but before I even start, I'd like to see if anyone has constructive suggestions or other approaches that would make this transition smoother.
>>>>
>>>> 1. Some sections, like Industry Users and Industry Developers, are growing so fast I'm not sure whether it's worth maintaining them in any meaningful format. I'd be inclined to make suggestions on where to start and let Google take them forward from there.
>>>> 2. Organize the developer section based on the pieces a new reader wants to learn (new to everything, new to Hadoop, all the tools for Hadoop development, "just check out code and go", etc).
>>>> 3. Organize the Users section a bit more. The "Setting up a Hadoop Cluster" is grouped well, but I'd perhaps rearrange the ordering a bit.
>>>>
>>>> -Ray
Testing Hadoop based system with large sets of data
Hi,

How do I test the ingestion speed of my system? I want to test with particular types of data. Are there any tools which generate this type of data?

Thanks in advance,
Basavaraj
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/

[Jul 27, 2016 10:41:09 AM] (aajisaka) HADOOP-9427. Use JUnit assumptions to skip platform-specific tests.
[Jul 27, 2016 8:58:04 PM] (yzhang) HDFS-10667. Report more accurate info about data corruption location.
[Jul 27, 2016 10:50:38 PM] (cnauroth) HADOOP-13354. Update WASB driver to use the latest version (4.2.0) of
[Jul 28, 2016 12:55:41 AM] (wang) HDFS-10519. Add a configuration option to enable in-progress edit log
[Jul 28, 2016 1:21:58 AM] (subru) YARN-5441. Fixing minor Scheduler test case failures
[Jul 28, 2016 3:06:09 AM] (varunsaxena) YARN-5431. TimelineReader daemon start should allow to pass its own

-1 overall

The following subsystems voted -1: asflicense unit

The following subsystems voted -1 but were configured to be filtered/ignored: cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s): unit

Specific tests:

Failed junit tests:
  hadoop.cli.TestHDFSCLI
  hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics
  hadoop.yarn.server.nodemanager.TestDirectoryCollection
  hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
  hadoop.yarn.server.TestContainerManagerSecurity
  hadoop.yarn.client.api.impl.TestYarnClient
  hadoop.mapred.gridmix.TestLoadJob

cc:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-compile-cc-root.txt [4.0K]

javac:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-compile-javac-root.txt [172K]

checkstyle:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-checkstyle-root.txt [16M]

pylint:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-patch-pylint.txt [16K]

shellcheck:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-patch-shellcheck.txt [20K]

shelldocs:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-patch-shelldocs.txt [16K]

whitespace:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/whitespace-eol.txt [12M]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/whitespace-tabs.txt [1.3M]

javadoc:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/diff-javadoc-javadoc-root.txt [2.3M]

unit:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [144K]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt [36K]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt [268K]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt [16K]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-nativetask.txt [124K]
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-unit-hadoop-tools_hadoop-gridmix.txt [16K]

asflicense:
  https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/116/artifact/out/patch-asflicense-problems.txt [4.0K]

Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org