Yes, just ignore this log. It only means that head closed the pipe after reading the bytes it needed, so -cat's last write to the pipe failed; the bytes you already redirected to the local file are fine.

On Mar 2, 2013 7:27 AM, "jamal sasha" <[email protected]> wrote:
> Though it copies.. but it gives this error?
>
> On Fri, Mar 1, 2013 at 3:21 PM, jamal sasha <[email protected]> wrote:
>
>> When I try this.. I get an error:
>> cat: Unable to write to output stream.
>>
>> Is this a permissions issue?
>> How do I resolve this?
>> Thanks
>>
>> On Wed, Feb 20, 2013 at 12:21 PM, Harsh J <[email protected]> wrote:
>>
>>> No problem JM, I was confused as well.
>>>
>>> AFAIK, there's no shell utility that lets you specify an offset (a
>>> number of bytes to skip before starting, similar to skip in dd?), but
>>> that can be done from the FS API.
>>>
>>> On Thu, Feb 21, 2013 at 1:14 AM, Jean-Marc Spaggiari
>>> <[email protected]> wrote:
>>> > Hi Harsh,
>>> >
>>> > My bad.
>>> >
>>> > I read the example quickly and I don't know why I thought you used
>>> > tail and not head.
>>> >
>>> > head will work perfectly. But tail will not, since it will need to
>>> > read the entire file. My comment was for tail, not for head, and
>>> > therefore not applicable to the example you gave.
>>> >
>>> > hadoop fs -cat 100-byte-dfs-file | tail -c 5 > 5-byte-local-file
>>> >
>>> > will have to download the entire file.
>>> >
>>> > Is there a way to "jump" to a certain position in a file and "cat"
>>> > from there?
>>> >
>>> > JM
>>> >
>>> > 2013/2/20, Harsh J <[email protected]>:
>>> >> Hi JM,
>>> >>
>>> >> I am not sure how "dangerous" it is, since we're using a pipe here,
>>> >> and as you yourself note, it will only run until the last bytes have
>>> >> been received and then terminate.
>>> >>
>>> >> The -cat process will terminate because the process we're piping to
>>> >> will terminate first, after it reaches its goal of -c <N bytes>; so
>>> >> the "-cat" program will certainly not fetch the whole file down,
>>> >> although it may fetch a few extra bytes over the wire due to the use
>>> >> of read buffers (the extra data won't be put into the target file,
>>> >> and gets discarded).
>>> >>
>>> >> We can try it out and observe the "clienttrace" logged at the DN at
>>> >> the end of the -cat's read. Here's an example:
>>> >>
>>> >> I wrote a ~1.6 MB file called "foo.jar"; see "bytes" below, it's
>>> >> ~1.58 MB:
>>> >>
>>> >> 2013-02-20 23:55:19,777 INFO
>>> >> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> >> /127.0.0.1:58785, dest: /127.0.0.1:50010, bytes: 1658314, op:
>>> >> HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_915204057_1, offset: 0,
>>> >> srvID: DS-1092147940-192.168.2.1-50010-1349279636946, blockid:
>>> >> BP-1461691939-192.168.2.1-1349279623549:blk_2568668834545125596_73870,
>>> >> duration: 192289000
>>> >>
>>> >> I ran the command "hadoop fs -cat foo.jar | head -c 5 > foo.xml" to
>>> >> store the first 5 bytes in a local file.
>>> >>
>>> >> Asserting that after the command we get 5 bytes:
>>> >> ➜ ~ wc -c foo.xml
>>> >> 5 foo.xml
>>> >>
>>> >> Asserting that the DN didn't IO-read the whole file, see the read op
>>> >> below and its "bytes" parameter; it's only about 193 KB, not the
>>> >> whole block of ~1.58 MB we wrote earlier:
>>> >>
>>> >> 2013-02-21 00:01:32,437 INFO
>>> >> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src:
>>> >> /127.0.0.1:50010, dest: /127.0.0.1:58802, bytes: 198144, op:
>>> >> HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-1698829178_1, offset: 0,
>>> >> srvID: DS-1092147940-192.168.2.1-50010-1349279636946, blockid:
>>> >> BP-1461691939-192.168.2.1-1349279623549:blk_2568668834545125596_73870,
>>> >> duration: 19207000
>>> >>
>>> >> I don't see how this is any more dangerous than doing a
>>> >> -copyToLocal/-get, which retrieves the whole file anyway?
>>> >>
>>> >> On Wed, Feb 20, 2013 at 9:25 PM, Jean-Marc Spaggiari
>>> >> <[email protected]> wrote:
>>> >>> But be careful.
>>> >>>
>>> >>> hadoop fs -cat will retrieve the entire file and will finish only
>>> >>> when it has retrieved the last bytes you are looking for.
>>> >>>
>>> >>> If your file is many GB big, it will take a long time for this
>>> >>> command to complete and will put some pressure on your network.
>>> >>>
>>> >>> JM
>>> >>>
>>> >>> 2013/2/19, jamal sasha <[email protected]>:
>>> >>>> Awesome, thanks :)
>>> >>>>
>>> >>>> On Tue, Feb 19, 2013 at 2:14 PM, Harsh J <[email protected]> wrote:
>>> >>>>
>>> >>>>> You can instead use 'fs -cat' and the 'head' coreutil, as one
>>> >>>>> example:
>>> >>>>>
>>> >>>>> hadoop fs -cat 100-byte-dfs-file | head -c 5 > 5-byte-local-file
>>> >>>>>
>>> >>>>> On Wed, Feb 20, 2013 at 3:38 AM, jamal sasha <[email protected]>
>>> >>>>> wrote:
>>> >>>>> > Hi,
>>> >>>>> > I was wondering, in the following command:
>>> >>>>> >
>>> >>>>> > bin/hadoop dfs -copyToLocal hdfspath localpath
>>> >>>>> >
>>> >>>>> > can we specify to copy not the full file, but, say, only x MB of
>>> >>>>> > it to the local drive?
>>> >>>>> >
>>> >>>>> > Is something like this possible?
>>> >>>>> > Thanks
>>> >>>>> > Jamal
>>> >>>>>
>>> >>>>> --
>>> >>>>> Harsh J
>>> >>
>>> >> --
>>> >> Harsh J
>>>
>>> --
>>> Harsh J
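
On the "jump to a certain position and cat from there" question, here is a
minimal sketch of the FS API approach Harsh mentions. The class name, argument
handling, and buffer size are illustrative choices only (not from this thread),
and it assumes a Hadoop client on the classpath plus Java 7 for
try-with-resources:

import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: copy `length` bytes of a DFS file, starting at `offset`,
// into a local file. seek() lets the client begin reading at the block holding
// the offset instead of streaming the whole file from the beginning.
public class PartialCopy {
  public static void main(String[] args) throws Exception {
    Path src = new Path(args[0]);          // DFS path to read from
    String dst = args[1];                  // local output file
    long offset = Long.parseLong(args[2]); // where to start reading
    long length = Long.parseLong(args[3]); // how many bytes to copy

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(src.toUri(), conf);

    byte[] buf = new byte[4096];
    try (FSDataInputStream in = fs.open(src);
         OutputStream out = new FileOutputStream(dst)) {
      in.seek(offset);                     // jump straight to the wanted position
      long remaining = length;
      while (remaining > 0) {
        int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
        if (n < 0) break;                  // hit EOF before `length` bytes
        out.write(buf, 0, n);
        remaining -= n;
      }
    }
  }
}

Packaged into a jar, it could be run with something like
"hadoop jar partial-copy.jar PartialCopy /dfs/path local-file 1048576 5242880"
(paths and numbers made up for illustration). As with the head pipe above, the
DN's clienttrace should then show roughly the requested range plus some
read-buffer slack being read, not the whole file.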
