[jira] [Commented] (HADOOP-15196) Zlib decompression fails when file having trailing garbage
[ https://issues.apache.org/jira/browse/HADOOP-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380701#comment-17380701 ] Hadoop QA commented on HADOOP-15196: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 59s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 42s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 5s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 4s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 23m 2s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are enabled, using SpotBugs. {color} | | {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 2m 24s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 24m 44s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 24m 44s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 6s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 22m 6s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 13s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 53s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 53s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} | |
[jira] [Commented] (HADOOP-15196) Zlib decompression fails when file having trailing garbage
[ https://issues.apache.org/jira/browse/HADOOP-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380549#comment-17380549 ] Srinivasu Majeti commented on HADOOP-15196: --- Hi [~brahmareddy], Are you going to get this code changes into production ? Or is there another fix for this situation. We might need to add another condition for message before ignoring trailing garbage. {code:java} message.contains("incorrect header check") along with message.contains("unknown compression method") {code} > Zlib decompression fails when file having trailing garbage > -- > > Key: HADOOP-15196 > URL: https://issues.apache.org/jira/browse/HADOOP-15196 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Major > Attachments: HADOOP-15196.patch > > > *When file has trailing garbage gzip will ignore.* > {noformat} > gzip -d 2018011309-js.rishenglipin.com.gz > gzip: 2018011309-js.rishenglipin.com.gz: decompression OK, trailing garbage > ignored > {noformat} > *when we use same file and decompress,we got following.* > {noformat} > 2018-01-13 14:23:43,151 | WARN | task-result-getter-3 | Lost task 0.0 in > stage 345.0 (TID 5686, node-core-gyVYT, executor 3): java.io.IOException: > unknown compression method > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native > Method) > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.decompress(ZlibDecompressor.java:225) > at > org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:91) > at > org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15196) Zlib decompression fails when file having trailing garbage
[ https://issues.apache.org/jira/browse/HADOOP-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630551#comment-16630551 ] Vinayakumar B commented on HADOOP-15196: Thanks for the fix [~brahmareddy]. 1. Patch fixes the said issue, except one case. i.e. If BuiltInGZipDecompressor is used, and size of trailing garbage is less than 10 bytes. Below change should be done in {{BuiltInGZipDecompressor#executeHeaderState()}} to handle this case as well. {code:java} @@ -253,8 +266,11 @@ private void executeHeaderState() throws IOException { if (state == GzipStateLabel.HEADER_BASIC) { int n = Math.min(userBufLen, 10-localBufOff); // (or 10-headerBytesRead) checkAndCopyBytesToLocal(n); // modifies userBufLen, etc. - if (localBufOff >= 10) { // should be strictly == + if (localBufOff > 0) { // should be strictly == processBasicHeader(); // sig, compression method, flagbits +if (ignoreTrailingGarbage) { + return; +} localBufOff = 0;// no further need for basic header state = GzipStateLabel.HEADER_EXTRA_FIELD; } {code} 2. Reset the {{newStream}} and {{ignoreTrailingGarbage}} flags if concatenated stream have valid bytes. Changes can be done in {{BuiltInGzipDecompressor#decompress()}} as below. {code:java} @@ -208,6 +216,11 @@ public synchronized int decompress(byte[] b, int off, int len) } catch (DataFormatException dfe) { throw new IOException(dfe.getMessage()); } + if (newSteam) { +//Reset if new stream have valid bytes +newSteam = false; +ignoreTrailingGarbage = false; + } crc.update(b, off, numAvailBytes); // CRC-32 is on _uncompressed_ data if (inflater.finished()) { state = GzipStateLabel.TRAILER_CRC; {code} 3. A test needs to be added to verify this. With both Native and Non-Native decompressors. Creating the gzip file with trailing garbage is very easy. Just create a gzip compressed file and append some extra bytes directly. > Zlib decompression fails when file having trailing garbage > -- > > Key: HADOOP-15196 > URL: https://issues.apache.org/jira/browse/HADOOP-15196 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Major > Attachments: HADOOP-15196.patch > > > *When file has trailing garbage gzip will ignore.* > {noformat} > gzip -d 2018011309-js.rishenglipin.com.gz > gzip: 2018011309-js.rishenglipin.com.gz: decompression OK, trailing garbage > ignored > {noformat} > *when we use same file and decompress,we got following.* > {noformat} > 2018-01-13 14:23:43,151 | WARN | task-result-getter-3 | Lost task 0.0 in > stage 345.0 (TID 5686, node-core-gyVYT, executor 3): java.io.IOException: > unknown compression method > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native > Method) > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.decompress(ZlibDecompressor.java:225) > at > org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:91) > at > org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15196) Zlib decompression fails when file having trailing garbage
[ https://issues.apache.org/jira/browse/HADOOP-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589940#comment-16589940 ] genericqa commented on HADOOP-15196: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 37s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 12s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 50s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 99m 39s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | HADOOP-15196 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936789/HADOOP-15196.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7d981600a210 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 1ac0144 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/15085/testReport/ | | Max. process+thread count | 1570 (vs. ulimit of 1) | | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/15085/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Zlib decompression
[jira] [Commented] (HADOOP-15196) Zlib decompression fails when file having trailing garbage
[ https://issues.apache.org/jira/browse/HADOOP-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589827#comment-16589827 ] Brahma Reddy Battula commented on HADOOP-15196: --- Uploaded the patch to ignore the '*trailing garbage*'. Kindly review. > Zlib decompression fails when file having trailing garbage > -- > > Key: HADOOP-15196 > URL: https://issues.apache.org/jira/browse/HADOOP-15196 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Major > Attachments: HADOOP-15196.patch > > > *When file has trailing garbage gzip will ignore.* > {noformat} > gzip -d 2018011309-js.rishenglipin.com.gz > gzip: 2018011309-js.rishenglipin.com.gz: decompression OK, trailing garbage > ignored > {noformat} > *when we use same file and decompress,we got following.* > {noformat} > 2018-01-13 14:23:43,151 | WARN | task-result-getter-3 | Lost task 0.0 in > stage 345.0 (TID 5686, node-core-gyVYT, executor 3): java.io.IOException: > unknown compression method > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native > Method) > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.decompress(ZlibDecompressor.java:225) > at > org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:91) > at > org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15196) Zlib decompression fails when file having trailing garbage
[ https://issues.apache.org/jira/browse/HADOOP-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352081#comment-16352081 ] Brahma Reddy Battula commented on HADOOP-15196: --- Read can be successful with *Hadoop {{text}} command* *for the same file* which uses the {{GZIPInputStream from JDK.}} But Using GzipCodec, decompressing is failing In both native and non-native cases. *So the behavior of Text command is same as "gzip" Linux command, which ignores the trailing garbage.* {color:#FF}*But GZipCodec impl will not ignore and fails the whole operation.*{color} > Zlib decompression fails when file having trailing garbage > -- > > Key: HADOOP-15196 > URL: https://issues.apache.org/jira/browse/HADOOP-15196 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Major > > *When file has trailing garbage gzip will ignore.* > {noformat} > gzip -d 2018011309-js.rishenglipin.com.gz > gzip: 2018011309-js.rishenglipin.com.gz: decompression OK, trailing garbage > ignored > {noformat} > *when we use same file and decompress,we got following.* > {noformat} > 2018-01-13 14:23:43,151 | WARN | task-result-getter-3 | Lost task 0.0 in > stage 345.0 (TID 5686, node-core-gyVYT, executor 3): java.io.IOException: > unknown compression method > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native > Method) > at > org.apache.hadoop.io.compress.zlib.ZlibDecompressor.decompress(ZlibDecompressor.java:225) > at > org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:91) > at > org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org