Re: hd fs -head?

Keith Wiley Mon, 27 Sep 2010 08:13:41 -0700

On 2010, Sep 27, at 7:02 AM, Edward Capriolo wrote:

On Mon, Sep 27, 2010 at 3:23 AM, Keith Wiley <[email protected]> wrote:
Is there a particularly good reason for why the "hadoop fs" command supports
-cat and -tail, but not -head?
Tail needs to be done efficiently, but head you can just do
yourself. Most people probably use

hadoop dfs -cat file | head -5.

I disagree with your use of the word "efficiently". :-) To my understanding (and perhaps that's the source of my error), the approach you suggested reads the entire file over the net from the cluster to your client machine. That file could conceivably be of HDFS scale (100s of GBs, even TBs wouldn't be uncommon).

What do you think? Am I wrong in my interpretation of how hadoop cat-pipe-head would work?
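
To make the question concrete, here is a minimal sketch of how a generic Unix pipeline behaves when the reader is head. The producer below is `yes`, a stand-in that would otherwise write forever; it is not a hadoop command, and the function name `first_lines` is made up for illustration. Once head has printed its lines and exits, the producer's next write to the closed pipe fails and it stops. Whether the hadoop client stops as promptly, and how much data it has already fetched from the cluster by then, is an assumption this sketch does not test.

```shell
# first_lines N: print the first N lines of an endless stream.
# `yes` is a stand-in for a very large "cat"; it is NOT a hadoop command.
# It only terminates because `head` exits and closes the read end of the pipe.
first_lines() {
    yes "record" | head -n "$1"
}

# Returns almost immediately even though the producer never ends on its own.
first_lines 5
```

If the hadoop client behaves like `yes` here, the cat-pipe-head approach would transfer only a little more than what head consumes, rather than the whole file.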


Cheers!

________________________________________________________________________________

Keith Wiley [email protected] keithwiley.com music.keithwiley.com

"And what if we picked the wrong religion? Every week, we're just making God
madder and madder!"
                                           --  Homer Simpson
________________________________________________________________________________
