Re: Default Branch on github

2015-01-06 Thread Allen Wittenauer
On Jan 6, 2015, at 9:25 AM, Bobby Evans ev...@yahoo-inc.com.INVALID wrote: https://github.com/apache/hadoop-common has a default branch of some really old odd branch HADOOP-3628. Any way we can change this to trunk? Anyone who goes to github looking for hadoop is going to see old/wrong

Code guidelines and bash

2014-07-26 Thread Allen Wittenauer
Hey folks: Deep linked by http://wiki.apache.org/hadoop/CodeReviewChecklist is the rule that line length should be ideally maximum 80 chars. (Sun coding guidelines.) In general, it's a good idea and it works for many many languages... Now the caveat. As most of you know, I've been

Re: best way to replace disks on a small cluster

2011-09-07 Thread Allen Wittenauer
On Sep 7, 2011, at 6:19 AM, Marco Cadetg wrote: Current situation: 3 slaves with each two 320GB disks in RAID 1. All the disks show high read errors and io throughput has gone below 5Mb/s without running any hadoop job. (It looks like it will fall apart soon...) One the special

Re: abandoning 22 - was: Content request for 0.20.205 Sustaining Release

2011-09-07 Thread Allen Wittenauer
On Sep 7, 2011, at 9:15 AM, Arun C Murthy wrote: Using real data helps - from Apache Jira, here are the statistics for work on trunk/hadoop-0.23 in Q3 of 2011 (i.e. last 2 months alone): Hadoop Common - 224 resolved* jiras Hadoop HDFS - 153 resolved* jiras Hadoop MapReduce - 161* resolved

Re: Dedicated disk for operating system

2011-08-10 Thread Allen Wittenauer
On Aug 10, 2011, at 7:56 AM, Evert Lammerts wrote: A short, slightly off-topic question: Also note that in this configuration that one cannot take advantage of the "keep the machine up at all costs" features in newer Hadoop's, which require that root, swap, and the log area be mirrored

Re: hadoop JARs not in lib/ directory of layout

2011-08-04 Thread Allen Wittenauer
On Aug 4, 2011, at 11:03 AM, Alejandro Abdelnur wrote: What is the rationale for having the hadoop JARs outside of the lib/ directory? It would definitely simplify packaging configuration if they are under lib/ as well. Any objection to it? It needs a big release note as this

Re: Hadoop build machine donations?

2011-08-02 Thread Allen Wittenauer
On Aug 1, 2011, at 11:22 PM, Nigel Daley wrote: Ideally the hardware is: * hosted and OS managed by the donor, * publicly addressable on the internet, * running Ubuntu or CentOS, and * sudo access can be given to Apache's Jenkins admins so they can create accounts for committers as

Re: Hadoop 0.22 update

2011-08-02 Thread Allen Wittenauer
On Aug 2, 2011, at 12:19 PM, milind.bhandar...@emc.com milind.bhandar...@emc.com wrote: 1. Cut a release 0.22.0 without mapreduce-2178 patch, with hadoop.security.authentication set to simple (I.e. No authentication). Make sure that MR-2178 is highlighted as known-issue in the top-level dir

Re: How Hadoop is deployed

2011-07-31 Thread Allen Wittenauer
On Jul 30, 2011, at 9:52 PM, Nan Zhu wrote: So, will the hadoop system co-exist with other workload in your data centers? No.

Re: MR1 next steps

2011-07-25 Thread Allen Wittenauer
On Jul 25, 2011, at 11:59 AM, Eli Collins wrote: Note! MR2 supports the current job API - users don't need to rewrite their jobs to run on MR2 - this is about the MR *implementation* not job compatibility. Note that the move to MR2 will affect some APIs (eg metrics, contrib projects that

Re: [VOTE] Abandon the HDFS proxy contrib component

2011-07-20 Thread Allen Wittenauer
On Jul 20, 2011, at 2:26 PM, Eli Collins wrote: Btw, Alejandro is contributing a full HDFS proxy replacement (also supports read/write, kerberos, etc) in HDFS-2178. Which is basically blocked by Alfredo's lack of security review.

Re: [VOTE] Abandon the HDFS proxy contrib component

2011-07-20 Thread Allen Wittenauer
On Jul 20, 2011, at 2:55 PM, Eli Collins wrote: I don't think we should remove HDFSProxy because Hoop exists, I think we should remove it because it's broken and not being maintained. Broken and un-maintained code should be removed no? That'd remove large chunks of hadoop.

Re: hadoop-0.23

2011-07-13 Thread Allen Wittenauer
On Jul 13, 2011, at 6:01 PM, Eli Collins wrote: In order to support HA in a dot release we'll need to merge in the branch for HDFS-1623, but that shouldn't hold up branching for 23. Sanjay mentioned this at the summit but I wanted to double check with you, you support a dot release of 23

Emeritus

2011-07-11 Thread Allen Wittenauer
What is the policy for cleansing the PMC and marking them as emeritus? How much dead weight does there need to be before it gets pruned?

Re: [RESULT] Powered by Logo

2011-06-24 Thread Allen Wittenauer
On Jun 24, 2011, at 9:47 AM, Eli Collins wrote: Off list I said I thought any of the logos were acceptable choices modulo some design tweaks, and that if the circle logo concept is accepted it would be good to adjust the circle itself. But the PMC did not vote for the circle logo, so it's

Re: [DISCUSSION] Thinking about 20.204 and beyond

2011-06-23 Thread Allen Wittenauer
On Jun 23, 2011, at 5:47 AM, Steve Loughran wrote: On 22/06/2011 17:27, Allen Wittenauer wrote: On Jun 22, 2011, at 2:30 AM, Steve Loughran wrote: I haven't even heard of anyone who owns up to moving to ext4 fs underneath. Yes you do. :D Did it work? and RHEL6.0

Re: [DISCUSSION] Thinking about 20.204 and beyond

2011-06-22 Thread Allen Wittenauer
On Jun 22, 2011, at 2:30 AM, Steve Loughran wrote: I haven't even heard of anyone who owns up to moving to ext4 fs underneath. Yes you do. :D

Re: Hadoop Java Versions

2011-06-22 Thread Allen Wittenauer
On Jun 22, 2011, at 1:27 PM, Scott Carey wrote: Problems have been reported with Hadoop, the 64-bit JVM and Compressed Object References (the -XX:+UseCompressedOops option), so use of that option is discouraged. I think the above is dated. It also lacks critical information. What JVM and

Re: [DISCUSSION] Thinking about 20.204 and beyond

2011-06-21 Thread Allen Wittenauer
On Jun 21, 2011, at 4:39 AM, Steve Loughran wrote: If I can actually bring up a heterogeneous cluster here I believe there was a post in one of the mailing lists in the past 6 months where someone tried a mixed endian grid. It blew up big time.

Re: [DISCUSSION] Thinking about 20.204 and beyond

2011-06-21 Thread Allen Wittenauer
On Jun 21, 2011, at 12:04 PM, Ted Dunning wrote: For the pain in doing this, it is probably better to just drop $10 and bring up a nice EC2 cluster with 10 m1.large instances using spot pricing for 5 hours. Testing on non-Intel, non-Linux is something we need to do more of.

Re: Thinking about the next hadoop mainline release

2011-06-17 Thread Allen Wittenauer
On Jun 17, 2011, at 12:36 AM, Ryan Rawson wrote: HDFS-918 and HDFS-347 are absolutely critical for random read performance. The smarter sites are already running HDFS-347 (I guess they aren't running Hadoop then?), and soon they will be testing and running HDFS-918 as well. Opening 1

Re: Thinking about the next hadoop mainline release

2011-06-17 Thread Allen Wittenauer
On Jun 17, 2011, at 10:31 AM, Allen Wittenauer wrote: On Jun 17, 2011, at 12:17 AM, Eric Baldeschwieler wrote: Yahoo stands ready to help us (the Apache Hadoop Community) turn this new release into a stable release by running it through its 9 month test and burn in process. The result

Re: Thinking about the next hadoop mainline release

2011-06-17 Thread Allen Wittenauer
On Jun 17, 2011, at 7:02 PM, Rajiv Chittajallu wrote: Allen Wittenauer wrote on 06/17/11 at 13:27:43 -0700: Actually, I was just reminded about the complete disaster that is metrics. So while it may be pseudo-stable, it isn't actually usable for anyone but Yahoo!. Did you try

Re: [VOTE] Powered by Logo

2011-06-15 Thread Allen Wittenauer
On Jun 15, 2011, at 1:44 AM, Ted Dunning wrote: 4, 2, 6 Yes. That isn't enough votes, but I think that the other logos don't cut the mustard for various reasons. 5 is a recycled product logo and I don't think the others make the required visual case. +1 4,2,6

Re: [VOTE] Shall we adopt the Defining Hadoop page

2011-06-14 Thread Allen Wittenauer
On Jun 14, 2011, at 3:56 PM, Owen O'Malley wrote: All, Steve Loughran has done some great work on defining what can be called Hadoop at http://wiki.apache.org/hadoop/Defining%20Hadoop. After some cleanup from Noirin and Shane, I think we've got a really good base. I'd like a vote to

Re: [VOTE] Shall we adopt the Defining Hadoop page

2011-06-14 Thread Allen Wittenauer
On Jun 14, 2011, at 6:45 PM, Eli Collins wrote: Are we really going to go after all the web companies that patch in an enhancement to their current Hadoop build and tell them to stop saying that they are using Hadoop? You've patched Hadoop many times, should your employer not be able to say

Re: LimitedPrivate and HBase (thoughts from an observer)

2011-06-08 Thread Allen Wittenauer
On Jun 8, 2011, at 6:53 AM, Doug Meil wrote: Re: How closely related does a project need to be to get this privilege? / What is the criteria by which an API gets opened to something outside of the Hadoop umbrella Given the context of the original question, is this debate really

LimitedPrivate and HBase

2011-06-06 Thread Allen Wittenauer
I have some concerns over the recent usage of LimitedPrivate being opened up to HBase. Shouldn't HBase really be sticking to public APIs rather than poking through some holes? If HBase needs an API, wouldn't other clients as well?

Re: LimitedPrivate and HBase

2011-06-06 Thread Allen Wittenauer
On Jun 6, 2011, at 10:00 AM, Todd Lipcon wrote: On Mon, Jun 6, 2011 at 9:45 AM, Allen Wittenauer a...@apache.org wrote: I have some concerns over the recent usage of LimitedPrivate being opened up to HBase. Shouldn't HBase really be sticking to public APIs rather than poking

Re: LimitedPrivate and HBase

2011-06-06 Thread Allen Wittenauer
On Jun 6, 2011, at 11:34 AM, Stack wrote: On Mon, Jun 6, 2011 at 9:45 AM, Allen Wittenauer a...@apache.org wrote: I have some concerns over the recent usage of LimitedPrivate being opened up to HBase. Shouldn't HBase really be sticking to public APIs rather than poking through

Re: LimitedPrivate and HBase

2011-06-06 Thread Allen Wittenauer
On Jun 6, 2011, at 4:22 PM, Andrew Purtell wrote: Perhaps opening a jira for a cleaner framework for HttpServer extension could be useful? Sure. That's probably what should have happened to begin with rather than quickly changing the API to a different classification. I was a

Re: LimitedPrivate and HBase

2011-06-06 Thread Allen Wittenauer
On Jun 6, 2011, at 6:08 PM, Todd Lipcon wrote: Let's face it: this happened because it was HBase. If it was almost anyone else, it would have sat there and *that's* the point where I'm mainly concerned. If you want to feel better, take a look at HDFS-941, HDFS-347, and

Re: Update on 0.22

2011-06-02 Thread Allen Wittenauer
On Jun 2, 2011, at 11:06 AM, Konstantin Shvachko wrote: I propose just to make them blockers before committing to attract attention of the release manager and get his approval. The traditional response has almost always been that they get changed to non-blockers before release. One person's

Re: Update on 0.22

2011-06-01 Thread Allen Wittenauer
On Jun 1, 2011, at 1:50 PM, Eric Baldeschwieler wrote: makes sense to me, but it might be good to work to make these decisions visible so folks can understand what is happening. lol

Re: Acceptance tests

2011-05-16 Thread Allen Wittenauer
On May 16, 2011, at 11:03 AM, Evert Lammerts wrote: Hi all, What acceptance tests are people using when buying clusters for Hadoop? Any pointers to relevant methods? We get some test nodes from various manufacturers. We do some raw IO benchmarking vs. our other nodes. We add

Re: Defining Hadoop Compatibility -revisiting-

2011-05-16 Thread Allen Wittenauer
On May 16, 2011, at 2:09 PM, Eli Collins wrote: Allen, There are few things in Hadoop in CDH that are not in trunk, branch-20-security, or branch-20-append. The stuff in this category is not major (eg HADOOP-6605, better JAVA_HOME detection). But that's my point: when is it no

Re: Defining Hadoop Compatibility -revisiting-

2011-05-13 Thread Allen Wittenauer
On May 13, 2011, at 1:53 AM, Doug Cutting wrote: Here certified is probably just intended to mean that the software uses a certified open source license, e.g., listed at http://www.opensource.org/licenses/. However they should say that this includes or contains the various Apache products,

Re: Defining Hadoop Compatibility -revisiting-

2011-05-13 Thread Allen Wittenauer
On May 13, 2011, at 2:55 PM, Doug Cutting wrote: On 05/13/2011 07:28 PM, Allen Wittenauer wrote: If it has a modified version of Hadoop (i.e., not an actual Apache release or patches which have never been committed to trunk), are they allowed to say includes Apache Hadoop? No. Those

Re: Defining Hadoop Compatibility -revisiting-

2011-05-13 Thread Allen Wittenauer
On May 13, 2011, at 3:16 PM, Doug Cutting wrote: On 05/14/2011 12:13 AM, Allen Wittenauer wrote: So what do we do about companies that release a product that says includes Apache Hadoop but includes patches that aren't committed to trunk? We yell at them to get those patches into trunk

Re: Defining Hadoop Compatibility -revisiting-

2011-05-13 Thread Allen Wittenauer
On May 13, 2011, at 3:53 PM, Ted Dunning wrote: But distribution Z includes X kind of implies the existence of some such that X != Y, Y != empty-set and X+Y = Z, at least in common usage. Isn't that the same as a non-trunk change? So doesn't this mean that your question reduces to the

Re: Defining Hadoop Compatibility -revisiting-

2011-05-12 Thread Allen Wittenauer
On May 12, 2011, at 2:23 AM, Steve Loughran wrote: I think Sun NFS might be a good example of a similar de facto standard, or MS SMB - it is up to others to show they are compatible with what is effectively the reference implementation. Being closed source, there is no option for anyone to

Re: Cleaning up documentation from website

2011-05-11 Thread Allen Wittenauer
On May 11, 2011, at 1:20 PM, Owen O'Malley wrote: We haven't cleaned up the version documentation in a long time: lrwxrwxr-x 1 omalley hadoop 11 May 11 20:14 current -> r0.20.203.0 ^--- this is a much more interesting discussion. (What does current mean now?)

Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-07 Thread Allen Wittenauer
On May 6, 2011, at 11:18 PM, Milind Bhandarkar wrote: Allen, there are per job limits, and per user limits in this branch. (So, max capacity of -1 is for the queue, but within the queue, the per user limits come into picture.) If I remember right, the defaults were based on a certain

Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-06 Thread Allen Wittenauer
On May 5, 2011, at 1:56 PM, Jakob Homan wrote: +1 Downloaded, verified, tested on single node cluster to my satisfaction. We've also brought this release up on a sizable cluster and checked its basic sanity. All of you people doing single node tests are missing stuff. For

Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-06 Thread Allen Wittenauer
On May 6, 2011, at 6:43 PM, Todd Papaioannou wrote: Allen, Can you provide some more details into what issues you are seeing with the capacity scheduler? Is it just the docs don't match the code, or are you seeing real issues with job scheduling? Jobs are definitely not getting

Re: [VOTE] Release candidate 0.20.203.0-rc1

2011-05-04 Thread Allen Wittenauer
On May 4, 2011, at 10:31 AM, Owen O'Malley wrote: Here's an updated release candidate for 0.20.203.0. I've incorporated the feedback and included all of the patches from 0.20.2, which is the last stable release. I also fixed the eclipse-plugin problem. The candidate is at:

Re: [ANNOUNCE] Hadoop Common/HDFS/MapReduce Committer: Koji Noguchi

2011-04-13 Thread Allen Wittenauer
On Apr 13, 2011, at 4:16 PM, Tsz Wo (Nicholas), Sze wrote: Hi all, The Hadoop PMC has elected Koji Noguchi as a committer of Hadoop Common/HDFS/MapReduce and he has accepted. Welcome aboard Koji! Yay! Congrats! Now there are even more people who Nicolas can attack about

Re: Apache Sonar - http://analysis.apache.org/ does anyone want hadoop in there?

2011-03-03 Thread Allen Wittenauer
On Mar 3, 2011, at 5:59 AM, Ian Holsman wrote: I just discovered this Apache site, Apache Sonar performs source code analysis on the java code, and it looks pretty, and I'm sure some people would find it useful. Is it actually returning for anyone else? I get a time out.

Re: Apache Sonar - http://analysis.apache.org/ does anyone want hadoop in there?

2011-03-03 Thread Allen Wittenauer
On Mar 3, 2011, at 10:13 AM, Allen Wittenauer wrote: On Mar 3, 2011, at 5:59 AM, Ian Holsman wrote: I just discovered this Apache site, Apache Sonar performs source code analysis on the java code, and it looks pretty, and I'm sure some people would find it useful

Re: [VOTE] Abandon hdfsproxy HDFS contrib

2011-02-18 Thread Allen Wittenauer
On Feb 18, 2011, at 2:11 AM, Bernd Fondermann wrote: I don't know how many Y-employees are working on H internally. Only the contributors can sort that out. Did Carol Bartz run over your puppy or something? You don't appear to realize that pretty much all the major companies that

Re: Hadoop testing project

2011-02-17 Thread Allen Wittenauer
On Feb 16, 2011, at 11:50 AM, Konstantin Boudnik wrote: As Joep said this ...will reduce the effort to take any (set of ) changes from development into production. Take it one step further: when your cluster is 'assembled' you need to validate it (on top of a concrete OS, etc.); is it

Re: [VOTE] Abandon hdfsproxy HDFS contrib

2011-02-17 Thread Allen Wittenauer
On Feb 17, 2011, at 4:43 AM, Bernd Fondermann wrote: To be honest: Hadoop is in the process of falling apart. We can thank the Apache Board for helping there as well. Their high handed interference basically set the project back 6 mos to a year; we're still recovering from the

Re: [VOTE] Abandon hdfsproxy HDFS contrib

2011-02-17 Thread Allen Wittenauer
On Feb 17, 2011, at 1:21 PM, Konstantin Shvachko wrote: hdfsproxy is a wrapper around hftpFileSystem (in its current state). So you can always replace hdfsproxy with hftpFileSystem. Also it uses pure FileSystem api, so it can successfully be maintained outside of hdfs. Therefore I am +1

Re: Hadoop 0.22 Blockers

2011-02-12 Thread Allen Wittenauer
On Feb 11, 2011, at 9:44 AM, Nigel Daley wrote: On Feb 11, 2011, at 9:41 AM, Allen Wittenauer wrote: On Feb 10, 2011, at 11:33 PM, Nigel Daley wrote: Tom has created a public Jira filter for 0.22 blockers (thanks Tom!): https://issues.apache.org/jira/secure/IssueNavigator.jspa?mode

Re: [VOTE] Abandon hod Common contrib

2011-02-11 Thread Allen Wittenauer
On Feb 10, 2011, at 6:51 PM, Nigel Daley wrote: I think the PMC should abandon the hod common contrib component. Its last meaningful contribution was February 2009: HADOOP-2898. Provide an option to specify a port range for Hadoop services provisioned by HOD. Contributed by Peeyush

Re: [VOTE] Abandon failmon Common contrib

2011-02-09 Thread Allen Wittenauer
On Feb 9, 2011, at 9:18 AM, Nigel Daley wrote: I think the PMC should abandon the failmon common contrib component. Its last meaningful contribution was its original commit in August of 2008: Are there any patches in the patch queue? When it comes to contrib, last commit

Re: blog.hadoop.com

2011-02-07 Thread Allen Wittenauer
On Feb 7, 2011, at 1:06 PM, Doug Cutting wrote: On 02/07/2011 12:37 PM, Chris Douglas wrote: Is the implication that blog.hadoop.com would point there? -C Sure, I'd be happy to redirect that CNAME. The other redirects for hadoop.com already point to hadoop.apache.org. I grabbed the

Re: bringing the codebases back in line

2010-10-22 Thread Allen Wittenauer
On Oct 21, 2010, at 5:51 PM, Eli Collins wrote: - The packaging is Linux specific, we've gotten push back when trying to contribute modifications upstream with Linuxisms since Apache supports non-Linux platforms (namely Solaris). Oh come now Eli. Just say it: I push everyone really

Re: bringing the codebases back in line

2010-10-21 Thread Allen Wittenauer
On Oct 21, 2010, at 12:13 PM, Ian Holsman wrote: Hi guys. I wanted to start a conversation about how we could merge the cloudera + yahoo distributions of hadoop into our codebase, and what would be required. *grabs popcorn*

Re: bringing the codebases back in line

2010-10-21 Thread Allen Wittenauer
On Oct 21, 2010, at 2:53 PM, Ian Holsman wrote: yep.. I've heard it's a source of contention... Sure. Maybe like 8 months ago to anyone who was paying attention. In discussing it with people, I've heard that a major issue (not the only one i'm sure) is lack of resources to actually

Re: apache commons configuration

2010-09-02 Thread Allen Wittenauer
On Sep 2, 2010, at 11:53 AM, Bob Li wrote: Hi experts, For hadoop 0.20.2, we have following task tracker configuration <property> <name>mapred.local.dir</name> <value>/data/mapred/local</value> </property> It seems not what I expected -- the directory /tmp/hadoop-user/mapred/local was

Re: [DISCUSS] Hadoop Security Release off Yahoo! patchset

2010-08-25 Thread Allen Wittenauer
On Aug 25, 2010, at 10:46 AM, Hemanth Yamijala wrote: I do agree that this would be very useful for folks who want security sooner. And the fact that Yahoo! have been running it at scale for a good while now is also assuring. As has been mentioned a few times, part of the security features

Re: Branching and testing strategy for 0.22

2010-08-23 Thread Allen Wittenauer
On Aug 23, 2010, at 3:19 PM, Owen O'Malley wrote: Are there any concerns? Just that 0.21 isn't even out of RC yet and that patches to fix it may get missed.

Re: Data Block Size ?

2010-07-15 Thread Allen Wittenauer
On Jul 15, 2010, at 11:40 AM, Syed Wasti wrote: Will it matter what the data block size is? Yes. It is recommended to have a block size of 64 MB, but if we want to have the data block size to 128 MB, should this affect the performance? Yes. FWIW, we run with 128MB. Does the size of
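
To make the 64 MB versus 128 MB question concrete, here is a minimal sketch of the arithmetic. It assumes a hypothetical 1 GiB input file, and the `block_count` helper is invented for illustration. A file occupies its size divided by the block size, rounded up, so doubling the block size roughly halves the number of blocks the NameNode has to track. With default input splitting it typically halves the number of map tasks as well.

```python
MB = 1024 * 1024


def block_count(file_size_bytes: int, block_size_bytes: int) -> int:
    """Blocks needed to hold a file: size / block size, rounded up."""
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


file_size = 1024 * MB  # hypothetical 1 GiB input file, not from the thread

for size_mb in (64, 128):
    blocks = block_count(file_size, size_mb * MB)
    print(f"{size_mb} MB block size -> {blocks} blocks")
```

This only counts blocks. Whether fewer, larger blocks actually run faster depends on the job and the cluster, which the arithmetic does not capture.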

Re: jni files

2010-07-09 Thread Allen Wittenauer
On Jul 9, 2010, at 2:09 AM, amit kumar verma wrote: I think I got a solution. As I read more about hadoop and JNI, I learned that I need to copy jni files to HADOOP_INSTALLATION_DIR//lib/native/Linux-xxx-xxx. lib/native/xxx are for the native compression libraries. They are not for

Re: jni files

2010-07-08 Thread Allen Wittenauer
On Jul 8, 2010, at 1:08 AM, amit kumar verma wrote: DistributedCache.addCacheFile(hdfs://* /192.168.0.153:50075*/libraries/mylib.so.1#mylib.so, conf); Do you actually have asterisks in this? If so, that's the problem.

Re: Why single thread for HDFS?

2010-07-05 Thread Allen Wittenauer
On Jul 5, 2010, at 5:01 PM, elton sky wrote: Well, this sounds good when you have many small files, you concat() them into a big one. I am talking about splitting a big file into blocks and copying a few of the blocks in parallel. Basically, your point is that hadoop dfs -cp is relatively slow and

Re: What are uses of taskTracker and JobTracker services?

2010-06-29 Thread Allen Wittenauer
On Jun 29, 2010, at 8:07 AM, Sarah kho wrote: Hi, Can you please let me know what tasks the TaskTracker and JobTracker perform? Pretty much the entirety of the MapReduce framework. You can think of it this way: HDFS -- MR; NameNode -- JobTracker; DataNode -- TaskTracker

Re: Hadoop support for hbase

2010-05-10 Thread Allen Wittenauer
Let me understand this: a) the hbase folks have been required to patch hadoop due to bugs b) they have been doing this for X months now c) we finally have momentum on getting 0.21 out the door d) hey, let's make their life easier and take resources out of 0.21 by creating a branch Are we

Re: Hadoop support for hbase

2010-05-10 Thread Allen Wittenauer
On May 10, 2010, at 10:18 AM, Stack wrote: The above is a fallacious setup. How does a branch in 0.20 detract from the 0.21 momentum (The append feature that we'd work on in 0.20 branch has little relation to how append works in 0.21). There are X amount of hours that people can put into

Re: Hadoop support for hbase

2010-05-10 Thread Allen Wittenauer
On May 10, 2010, at 11:05 AM, Ryan Rawson wrote: That's not how it works though - people have adopted and use Hadoop 0.20 because of the fact that people like Yahoo, Facebook, etc run it on multi-thousand node clusters and have done so for months (or soon to be years now). If you look

Re: Could not obtain block blk_

2010-05-07 Thread Allen Wittenauer
On May 7, 2010, at 3:32 AM, Steve Loughran wrote: Pierre ANCELOT wrote: All my nodes are up, I can get the file through the DFS client... 0.20.2 OK, that's interesting. Try looking at the replication count and upping it, or copying the file to make sure it's all there. Or running fsck

Re: JIRA snafu?

2010-04-13 Thread Allen Wittenauer
On Apr 13, 2010, at 9:24 AM, Arun C Murthy wrote: I can no longer grant permission to Apache for my patches on JIRA. Anyone else? I haven't tried, but likely related to issues.apache.org getting compromised.

Re: rack awareness question

2010-04-11 Thread Allen Wittenauer
On Apr 10, 2010, at 7:33 AM, Mag Gam wrote: What is the procedure to do this? Change the topology script, bounce NN and JT. is this possible to do while the cluster is running? No. There is a jira i filed on turning the cache into an actual cache w/TTLs, etc.

Re: [VOTE] Replace HBase with Cassandra

2010-04-01 Thread Allen Wittenauer
I'm turning off my email today. On 4/1/10 8:05 AM, Bill Abernathy billaberna...@ymail.com wrote: I propose a vote to replace HBase as a Hadoop sub-project with Cassandra. Why members should vote for this proposal: * Twitter is using Cassandra * Cassandra sounds like someone you'd

Re: How to Recommission?

2010-03-31 Thread Allen Wittenauer
On 3/31/10 8:12 PM, Zhanlei Ma z...@vmware.com wrote: But how to Recommission? Wish your help. Take them out of dfs.exclude and refreshnodes again.
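
The two steps in that reply can be sketched as follows. This is a rough illustration, not a tested operational procedure. It assumes the exclude file is a plain list of one hostname per line (the file the `dfs.hosts.exclude` property points at), and the helper names are made up here. The second step shells out to `hadoop dfsadmin -refreshNodes`, so it only works where the `hadoop` command is on the PATH.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def remove_from_exclude(exclude_file: str, hosts: Iterable[str]) -> List[str]:
    """Rewrite the exclude file without the given hosts.

    Returns the hosts that are still excluded afterwards.
    """
    to_remove = set(hosts)
    path = Path(exclude_file)
    remaining = [
        line.strip()
        for line in path.read_text().splitlines()
        if line.strip() and line.strip() not in to_remove
    ]
    path.write_text("".join(host + "\n" for host in remaining))
    return remaining


def refresh_nodes() -> None:
    """Ask the NameNode to re-read its include/exclude files."""
    subprocess.run(["hadoop", "dfsadmin", "-refreshNodes"], check=True)
```

A call such as `remove_from_exclude("/path/to/dfs.exclude", ["node07"])` followed by `refresh_nodes()` would mirror the advice above; the path and hostname are placeholders.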

Re: rack awareness help

2010-03-19 Thread Allen Wittenauer
On 3/19/10 4:32 AM, Mag Gam magaw...@gmail.com wrote: Thanks everyone. I think everyone can agree that this part of the documentation is lacking for hadoop. Can someone please provide me a use case, for example: #server 1 Input script.sh Output rack01 #server 2 Input script.sh

Re: Native libraries on Mac

2010-03-11 Thread Allen Wittenauer
*nods* There is a bug with the bitness detection in general (it breaks horribly on Solaris due to the java executable supporting both). I've started working on a general patch (it a] asks Java what bitness it is and b] asks java which JVM lib to link against), but got sidetracked by switching

Commit Log Hacks ( https://issues.apache.org/jira/browse/HADOOP-6628 )

2010-03-11 Thread Allen Wittenauer
Seeing as how our two non-mainline distributions don't have real release notes published, I've built two quick and dirty perl scripts (yes perl! Fear the ops people!) to take the commit log from CDH and the Yahoo! changes files to spit out Confluence-style wiki output. Changing it to other wiki

Re: Compilation failed when compile hadoop common release-0.20.2

2010-03-08 Thread Allen Wittenauer
On 3/8/10 10:06 AM, Gary Yang garyya...@yahoo.com wrote: Hi Owen, Thanks for the reply. From the link you provided, I found the build instruction. I do not understand the option, -Djava5.home=/usr/local/jdk1.5. Does it mean I have to use JDK 1.5? I read somewhere it suggested to use JDK

Re: Hadoop 0.20.2 - New Mockito jar in lib ? - And discovering the reason for new jars in general

2010-03-04 Thread Allen Wittenauer
On 3/4/10 7:24 AM, Stephen Watt sw...@us.ibm.com wrote: I notice in 0.20.2 we now have a new jar in the lib called mockito-all-1.8.0 (mockito.org states it is a Java Test mock-up framework). In the interest of teaching this man to fish, in cases such as this, how can I determine the reason

Re: rack awareness help

2010-03-03 Thread Allen Wittenauer
On 3/2/10 7:43 PM, Mag Gam magaw...@gmail.com wrote: I have 5 slave servers and I would like to be rack-aware, meaning each server represents 1 rack resulting in 5 racks. I have looked around for examples online but could not find anything concrete. Can someone please show me an example on
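
For the five-servers, five-racks case asked about here, a topology script could look roughly like the sketch below. It is an illustration only: the hostnames and rack names are hypothetical, and the script would be wired in through the topology script property (`topology.script.file.name` in the 0.20 line). Hadoop invokes the script with one or more node names as arguments and reads one rack path per argument from standard output.

```python
#!/usr/bin/env python3
import sys

# Hypothetical host-to-rack table: five slaves, one rack each.
RACKS = {
    "slave1": "/rack01",
    "slave2": "/rack02",
    "slave3": "/rack03",
    "slave4": "/rack04",
    "slave5": "/rack05",
}

# Fallback for anything not listed above.
DEFAULT_RACK = "/default-rack"


def resolve(names):
    """Map each node name to its rack path, in argument order."""
    return [RACKS.get(name, DEFAULT_RACK) for name in names]


# Hadoop passes the nodes to resolve as command-line arguments.
print("\n".join(resolve(sys.argv[1:])))
```

Run by hand as `script.py slave1 slave3`, this prints `/rack01` and `/rack03` on separate lines. Nodes may be passed as IP addresses rather than hostnames, so a real table would probably need entries for both.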

Re: rack awareness help

2010-03-03 Thread Allen Wittenauer
On 3/3/10 4:11 AM, Mag Gam magaw...@gmail.com wrote: An example would be very helpful. There is only 1 paragraph about this but it's far too important not to have an example or two. I covered this in my preso to apachecon last year:

Re: testing if replication working

2010-03-01 Thread Allen Wittenauer
On 2/28/10 11:48 PM, Terrence Martin tmar...@physics.ucsd.edu wrote: Mag Gam wrote: I just setup my first hadoop cluster with 5 nodes. What is the best way to check if replication is really working? I assume the best way is to power down 2 nodes and see if I can still reach my data? Well

Re: Adding hard-disks to an existing HDFS cluster

2010-03-01 Thread Allen Wittenauer
Marc, You might find my preso I did on Hadoop at Apachecon EU last year handy: http://wiki.apache.org/hadoop/HadoopPresentations?action=AttachFile&do=view&target=aw-apachecon-eu-2009.pdf aka http://bit.ly/d3UU4A It talks a bit about the care and feeding of your Hadoop grid, including how to

Re: Release plans

2010-02-18 Thread Allen Wittenauer
On 2/18/10 11:29 AM, Doug Cutting cutt...@apache.org wrote: Eli Collins wrote: What are the current plans for the 21 release? Personally, I'd rather re-branch 21 from trunk in early March. I believe Y!'s security changes will be feature-complete in trunk by then. Symlinks and

Re: [VOTE] Release candidate for Hadoop 0.20.2

2010-02-11 Thread Allen Wittenauer
On 2/11/10 9:00 AM, Owen O'Malley omal...@apache.org wrote: These same problems are present as seen in the branch-20 Hudson build: http://hudson.zones.apache.org/hudson/job/Hadoop-20-Build/67/console The problem on Hudson seems to be that automake 1.9 is not installed. That seems like a

Re: set how much CPU to be utilised by a MapReduce job

2010-01-24 Thread Allen Wittenauer
On 1/24/10 10:33 PM, Naveen Kumar Prasad naveenkum...@huawei.com wrote: If many jobs are running concurrently in Hadoop, how can we set CPU usage for individual tasks. That functionality does not exist.

Re: 0.20.2 HDFS incompatible with 0.20.1

2010-01-05 Thread Allen Wittenauer
On 1/5/10 11:29 AM, Todd Lipcon t...@cloudera.com wrote: 1) Although we certainly do not guarantee wire compatibility between minor versions (0.20 -> 0.21) have we previously implied wire compatibility between bugfix releases? IIRC, it has been implied and was a goal but not officially written

Re: How to configure read/write/execute ACLs ?

2009-12-01 Thread Allen Wittenauer
On 12/1/09 10:37 AM, ben.cot...@lehman.com benjamin.cot...@lehman.com wrote: How do I configure Hadoop ACLs to specify a uid's read/write/execute privileges? If I parse your question correctly, you want to limit certain uids to only be able to read or write certain data? That