[
https://issues.apache.org/jira/browse/HDFS-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Grayson updated HDFS-4829:
-------------------------------
Environment:
OS Centos 6.3 (on Intel Core2 Duo, VMware Player VM running under windows 7)
Testing on both 2.0.0-cdh4.1.1 and 2.0.0-cdh4.1.2
was:
OS Centos 6.3 (on Intel Core2 Duo, VMware Player VM running under windows 7)
Testing on both 2.0.0-cdh4.1.1 and 2.0.1-cdh4.1.2
Affects Version/s: (was: 2.0.1-alpha)
> Strange loss of data displayed in hadoop fs -tail command when data is
> separated by periods?
> --------------------------------------------------------------------------------------------
>
> Key: HDFS-4829
> URL: https://issues.apache.org/jira/browse/HDFS-4829
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.0.0-alpha
> Environment: OS Centos 6.3 (on Intel Core2 Duo, VMware Player VM
> running under windows 7)
> Testing on both 2.0.0-cdh4.1.1 and 2.0.0-cdh4.1.2
> Reporter: Todd Grayson
> Priority: Minor
>
> Strange behavior of the hadoop fs -tail command: by default it seems to print
> 9 lines of output versus 10 lines for the OS version of the command (minor
> issue). The stranger behavior (a bug?) is that it appears to drop the initial
> octet from an IP address when examining a file over HDFS.
> [training@localhost hands-on]$ hadoop fs -tail weblog/access_log
> .190.174.142 - - [03/Dec/2011:13:28:08 -0800] "GET
> /assets/js/javascript_combined.js HTTP/1.1" 200 20404
> 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] "GET
> /assets/img/home-logo.png HTTP/1.1" 200 3892
> 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] "GET
> /images/filmmediablock/360/019.jpg HTTP/1.1" 200 74446
> 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] "GET
> /images/filmmediablock/360/g_still_04.jpg HTTP/1.1" 200 761555
> 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] "GET
> /images/filmmediablock/360/07082218.jpg HTTP/1.1" 200 154609
> 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] "GET
> /images/filmpics/0000/2229/GOEMON-NUKI-000163.jpg HTTP/1.1" 200 184976
> 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] "GET
> /images/filmmediablock/360/GOEMON-NUKI-000163.jpg HTTP/1.1" 200 60117
> 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] "GET
> /images/filmmediablock/360/Chacha.jpg HTTP/1.1" 200 109379
> 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] "GET
> /images/filmmediablock/360/GOEMON-NUKI-000159.jpg HTTP/1.1" 200 161657
> *When looking at the original log data outside of HDFS with the OS version of
> the tail command, we see the following:*
> [training@localhost hands-on]$ hadoop fs -get weblog/access_log ./
> [training@localhost hands-on]$ tail access_log
> 10.190.174.142 - - [03/Dec/2011:13:28:06 -0800] "GET
> /images/filmpics/0000/2229/GOEMON-NUKI-000163.jpg HTTP/1.1" 200 184976
> 10.190.174.142 - - [03/Dec/2011:13:28:08 -0800] "GET
> /assets/js/javascript_combined.js HTTP/1.1" 200 20404
> 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] "GET
> /assets/img/home-logo.png HTTP/1.1" 200 3892
> 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] "GET
> /images/filmmediablock/360/019.jpg HTTP/1.1" 200 74446
> 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] "GET
> /images/filmmediablock/360/g_still_04.jpg HTTP/1.1" 200 761555
> 10.190.174.142 - - [03/Dec/2011:13:28:09 -0800] "GET
> /images/filmmediablock/360/07082218.jpg HTTP/1.1" 200 154609
> 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] "GET
> /images/filmpics/0000/2229/GOEMON-NUKI-000163.jpg HTTP/1.1" 200 184976
> 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] "GET
> /images/filmmediablock/360/GOEMON-NUKI-000163.jpg HTTP/1.1" 200 60117
> 10.190.174.142 - - [03/Dec/2011:13:28:10 -0800] "GET
> /images/filmmediablock/360/Chacha.jpg HTTP/1.1" 200 109379
> 10.190.174.142 - - [03/Dec/2011:13:28:11 -0800] "GET
> /images/filmmediablock/360/GOEMON-NUKI-000159.jpg HTTP/1.1" 200 161657
> *When using non-IP data separated by periods, it gets even worse and even more
> data is masked (same data, substituting names for IP octets). Note that we lose
> the first line well into the URI string.*
> [training@localhost hands-on]$ hadoop fs -tail weblog/test_log
> s/javascript_combined.js HTTP/1.1" 200 20404
> larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] "GET
> /assets/img/home-logo.png HTTP/1.1" 200 3892
> larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] "GET
> /images/filmmediablock/360/019.jpg HTTP/1.1" 200 74446
> larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] "GET
> /images/filmmediablock/360/g_still_04.jpg HTTP/1.1" 200 761555
> larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] "GET
> /images/filmmediablock/360/07082218.jpg HTTP/1.1" 200 154609
> larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] "GET
> /images/filmpics/0000/2229/GOEMON-NUKI-000163.jpg HTTP/1.1" 200 184976
> larry.billy.will.amy - - [03/Dec/2011:13:28:11 -0800] "GET
> /images/filmmediablock/360/GOEMON-NUKI-000163.jpg HTTP/1.1" 200 60117
> larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] "GET
> /images/filmmediablock/360/Chacha.jpg HTTP/1.1" 200 larry.379
> larry.billy.will.amy - - [03/Dec/2011:13:28:11 -0800] "GET
> /images/filmmediablock/360/GOEMON-NUKI-000159.jpg HTTP/1.1" 200 161657
> *Verifying against the normal tail output: note that the first line is not
> represented in the hadoop fs -tail output, as it is only grabbing 9 lines
> instead of 10, as mentioned before. Align the two text-based examples along
> the javascript_combined line.*
> [training@localhost hands-on]$ tail test_log
> larry.billy.will.amy - - [03/Dec/2011:13:28:06 -0800] "GET
> /images/filmpics/0000/2229/GOEMON-NUKI-000163.jpg HTTP/1.1" 200 184976
> larry.billy.will.amy - - [03/Dec/2011:13:28:08 -0800] "GET
> /assets/js/javascript_combined.js HTTP/1.1" 200 20404
> larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] "GET
> /assets/img/home-logo.png HTTP/1.1" 200 3892
> larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] "GET
> /images/filmmediablock/360/019.jpg HTTP/1.1" 200 74446
> larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] "GET
> /images/filmmediablock/360/g_still_04.jpg HTTP/1.1" 200 761555
> larry.billy.will.amy - - [03/Dec/2011:13:28:09 -0800] "GET
> /images/filmmediablock/360/07082218.jpg HTTP/1.1" 200 154609
> larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] "GET
> /images/filmpics/0000/2229/GOEMON-NUKI-000163.jpg HTTP/1.1" 200 184976
> larry.billy.will.amy - - [03/Dec/2011:13:28:11 -0800] "GET
> /images/filmmediablock/360/GOEMON-NUKI-000163.jpg HTTP/1.1" 200 60117
> larry.billy.will.amy - - [03/Dec/2011:13:28:larry.-0800] "GET
> /images/filmmediablock/360/Chacha.jpg HTTP/1.1" 200 larry.379
> larry.billy.will.amy - - [03/Dec/2011:13:28:11 -0800] "GET
> /images/filmmediablock/360/GOEMON-NUKI-000159.jpg HTTP/1.1" 200 161657
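
One possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: the Hadoop FileSystem shell documentation describes -tail as displaying the last kilobyte of the file, not the last 10 lines. A byte-offset tail starts wherever the 1024-byte mark falls, which can be mid-line, and the number of complete lines it shows depends on line length. The Python sketch below contrasts the two behaviors on in-memory data; tail_bytes and tail_lines are hypothetical helper names for illustration, not Hadoop APIs.

```python
def tail_bytes(data: bytes, n: int = 1024) -> bytes:
    """Return the last n bytes of data (byte-offset tail).

    Assumption: this mirrors a 'last kilobyte' tail; it is not Hadoop code.
    """
    if n <= 0:
        return b""
    return data[-n:]


def tail_lines(data: bytes, n: int = 10) -> bytes:
    """Return the last n lines of data (line-based tail, like the OS command)."""
    if n <= 0:
        return b""
    lines = data.splitlines(keepends=True)
    return b"".join(lines[-n:])


if __name__ == "__main__":
    sample = b"aaaa\nbbbb\ncccc\n"
    # The byte-based tail cuts into the middle of the second line.
    print(tail_bytes(sample, 7))   # b'b\ncccc\n'
    # The line-based tail always starts on a line boundary.
    print(tail_lines(sample, 2))   # b'bbbb\ncccc\n'
```

If this reading is right, it would also be consistent with test_log: its lines are longer than those in access_log, so the same 1024-byte window would begin further into the first line shown.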