On Sep 27, 2010, at 7:02 AM, Edward Capriolo wrote:

On Mon, Sep 27, 2010 at 3:23 AM, Keith Wiley <[email protected]> wrote:
Is there a particularly good reason why the "hadoop fs" command supports
-cat and -tail, but not -head?


Tail needs to be implemented efficiently, but head you can just do
yourself. Most people probably use

hadoop dfs -cat file | head -5


I disagree with your use of the word "efficiently". :-) To my understanding (and perhaps this is where I am mistaken), the approach you suggested reads the entire file over the network from the cluster to the client machine. That file could well be of HDFS scale: hundreds of GBs, and even TBs would not be uncommon.

What do you think? Am I wrong in my interpretation of how the hadoop-cat-piped-to-head approach would work?
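The pipe half of this question can be checked locally, without a cluster. The sketch below uses `yes` as a stand-in for a never-ending `hadoop dfs -cat`. It only shows how a shell pipeline behaves once `head` exits. Whether the Hadoop client stops pulling data from the cluster just as promptly is an assumption that would still need to be tested against a real file.

```shell
# Local stand-in for "hadoop dfs -cat file": `yes` writes "record" forever.
# If a pipe forced the producer to run to completion, this would never return.
# `|| true` guards against shells running with pipefail, where the producer
# being stopped by the closed pipe would count as a pipeline failure.
first_lines=$(yes "record" | head -5 || true)

# Prints the five lines that head let through before it exited.
echo "$first_lines"
```

If the pipeline returns at all, the producer was stopped when head closed its end of the pipe rather than being run to the end of its output.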

Cheers!

________________________________________________________________________________
Keith Wiley [email protected] keithwiley.com music.keithwiley.com

"And what if we picked the wrong religion? Every week, we're just making God
madder and madder!"
                                           --  Homer Simpson
________________________________________________________________________________
