Supposing you do have your part-r-XXXX files fully ordered,

you can do

hadoop dfs -cat "output/solr/part-*" > yourLocalFile
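The idea can be sanity-checked locally with plain files standing in for the HDFS parts (a sketch with made-up /tmp paths; hadoop's shell expands its own glob, but the sorted-expansion behaviour is the same idea):

```shell
# Local analogy of the glob merge: globs expand part files in
# lexicographic order, so already-ordered parts line up end to end.
mkdir -p /tmp/solr-demo
printf 'a\nb\n' > /tmp/solr-demo/part-00000
printf 'c\nd\n' > /tmp/solr-demo/part-00001
cat /tmp/solr-demo/part-* > /tmp/solr-demo/merged
cat /tmp/solr-demo/merged
```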

tada :)

Cheers
Olivier


Alex Parvulescu wrote:
Hello,

one minor correction.

I'm talking about 'hadoop dfs -getmerge' . You are right, '-cat' is the equivalent of '-get' and they both handle only files.

I'd like to see an equivalent of 'getmerge' to stdout.

sorry for the confusion
alex

On Tue, Mar 16, 2010 at 11:31 AM, Alex Parvulescu <alex.parvule...@gmail.com> wrote:

    Hello Olivier,

    I've tried 'cat'. This is the error I get: 'cat: Source must be a file.'
    This happens when I try to get all parts from a directory as a
    single .csv file.

    Something like that:
      hadoop dfs -cat hdfs://master:54310/user/hadoop-user/output/solr/
      cat: Source must be a file.
    This is what the dir looks like:
      hadoop dfs -ls hdfs://master:54310/user/hadoop-user/output/solr/
      Found 3 items
      drwxr-xr-x   - hadoop supergroup          0 2010-03-12 16:36
    /user/hadoop-user/output/solr/_logs
      -rw-r--r--   2 hadoop supergroup   64882566 2010-03-12 16:36
    /user/hadoop-user/output/solr/part-00000
      -rw-r--r--   2 hadoop supergroup   51388943 2010-03-12 16:36
    /user/hadoop-user/output/solr/part-00001

    It seems -get can merge everything into one file but cannot write to
    stdout, while 'cat' can write to stdout but seems to require fetching
    the parts one by one.
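    One possible workaround (a sketch, reusing the paths from the listing
    above): merge the parts to a local temp file with -getmerge, stream
    that to stdout, then clean up. It needs a running cluster, so treat it
    as untested.

```shell
# Sketch of a getmerge-to-stdout workaround; the HDFS path is the
# one from the directory listing in this thread.
tmp=$(mktemp)
hadoop dfs -getmerge hdfs://master:54310/user/hadoop-user/output/solr "$tmp"
cat "$tmp"
rm -f "$tmp"
```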

    Or am I missing something?

    thanks,
    alex


    On Tue, Mar 16, 2010 at 11:28 AM, Varene Olivier <var...@echo.fr> wrote:

        Hello Alex,

        get writes a file to your local filesystem

        hadoop dfs [-get [-ignoreCrc] [-crc] <src> <localdst>]

        with
         src : your file in your hdfs
         localdst : the name of the file on your local filesystem that
            will hold the data collected from src


        To get the results to STDOUT,
        you can use cat

        hadoop dfs [-cat <src>]

        with src : your file in your hdfs

        Regards
        Olivier

        Alex Parvulescu wrote:

            Hello,

            Is there a reason for which 'hadoop dfs -get' will not
            output to stdout?

            I see 'hadoop dfs -put' can handle stdin.  It would seem
            natural for dfs to also support outputting to stdout.


            thanks,
            alex




