[jira] [Commented] (COMPRESS-467) Could not unzip the file properly
[ https://issues.apache.org/jira/browse/COMPRESS-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656266#comment-16656266 ]

Stefan Bodewig commented on COMPRESS-467:
-----------------------------------------

I see. Your file consists of multiple bz2 streams, so you need to enable "decompressConcatenated" by using the two-arg constructor of {{BZip2CompressorInputStream}}, passing in {{true}} as the second argument. I.e.

{code}
public static void main(String[] args) throws Exception {
    try (BZip2CompressorInputStream i = new BZip2CompressorInputStream(
             new FileInputStream("HS_H08_20180927_0600_B02_FLDK_R10_S0110.DAT.bz2"), true);
         OutputStream o = new FileOutputStream("test")) {
        IOUtils.copy(i, o);
    }
}
{code}

The result I get is identical to the one {{bunzip2}} creates on my Linux system.

> Could not unzip the file properly
> ---------------------------------
>
> Key: COMPRESS-467
> URL: https://issues.apache.org/jira/browse/COMPRESS-467
> Project: Commons Compress
> Issue Type: Bug
> Environment: Windows 7
> Reporter: Yang Lin
> Priority: Major
> Labels: windows
> Attachments: HS_H08_20180927_0600_B02_FLDK_R10_S0110.DAT.bz2
>
>
> When I use WinRAR to unzip the file, I get the correct result, which is about 24,200,000 bytes.
> But when I use commons-compress to unzip this file, I only get 900,000 bytes (process: sourceFile -> byte[] -> InputStream).
> Unzipping it the other way with commons-compress, sourceFile -> InputStream, throws an exception:
> java.io.IOException: Stream is not in the BZip2 format, at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.init(BZip2CompressorInputStream.java:261).
> Thank you in advance!

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (COMPRESS-467) Could not unzip the file properly
[ https://issues.apache.org/jira/browse/COMPRESS-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656172#comment-16656172 ]

Yang Lin commented on COMPRESS-467:
-----------------------------------

Thank you. My version is 1.18 too. I ran your code and no exception was thrown, but the result file is only 900,000 bytes. For now I invoke WinRAR from the Windows command line to decompress those bz2 files, but that doesn't look very professional. @Stefan Bodewig
[jira] [Commented] (VFS-676) VFS does not support several Zip file compression methods
[ https://issues.apache.org/jira/browse/VFS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655655#comment-16655655 ]

Gary Gregory commented on VFS-676:
----------------------------------

Patches via github PRs are more than welcome ;-)
[jira] [Created] (VFS-676) VFS does not support several Zip file compression methods
Sérgio M. M. Ribeiro created VFS-676:
-------------------------------------

Summary: VFS does not support several Zip file compression methods
Key: VFS-676
URL: https://issues.apache.org/jira/browse/VFS-676
Project: Commons VFS
Issue Type: Improvement
Affects Versions: 2.2, 2.2.1
Reporter: Sérgio M. M. Ribeiro


VFS currently bases its ZipFileSystem implementation on the "java.util.zip" package, which means it supports *only* the +STORED+ and +DEFLATE+ compression methods.

If it were using *Apache Commons Compress*, it would support many more (see https://commons.apache.org/proper/commons-compress/zip.html#encryption). At the time this issue is being created, the supported methods for version 1.18 are: STORED, DEFLATE, SHRINK, IMPLODE, BZIP2, DEFLATE64 (enhanced deflate).

It would be nice to have one of the following options:
# Change the current implementation to use Apache Commons Compress (this would, most probably, make Apache Commons Compress a non-optional dependency);
# Add an Apache Commons Compress based provider and have some (easy) way of letting the user decide which implementation to use.

The first option would make Apache Commons Compress a *non-optional* dependency, while the second one could allow it to remain optional. Currently, if Apache Commons Compress is available, one has access to both Bzip2FileProvider and TarFileProvider - a strategy that could be used to decide which ZipFileSystem implementation to use.
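To make the limitation described above concrete, here is a minimal JDK-only sketch that classifies ZIP compression method codes by whether "java.util.zip" can decode them. The class and method names are hypothetical and are not part of VFS or Commons Compress; the numeric codes other than the two {{ZipEntry}} constants are taken from the ZIP format specification.

```java
import java.util.zip.ZipEntry;

/** Hypothetical helper, for illustration only; not VFS or Commons Compress API. */
final class ZipMethodCheck {
    // Method codes as defined by the ZIP format specification.
    static final int STORED = ZipEntry.STORED;     // 0
    static final int SHRINK = 1;
    static final int IMPLODE = 6;
    static final int DEFLATE = ZipEntry.DEFLATED;  // 8
    static final int DEFLATE64 = 9;
    static final int BZIP2 = 12;

    /** True only for the two methods java.util.zip is able to decode. */
    static boolean readableByJavaUtilZip(int method) {
        return method == ZipEntry.STORED || method == ZipEntry.DEFLATED;
    }

    public static void main(String[] args) {
        int[] methods = {STORED, SHRINK, IMPLODE, DEFLATE, DEFLATE64, BZIP2};
        for (int m : methods) {
            System.out.println("method " + m + " readable by java.util.zip: "
                    + readableByJavaUtilZip(m));
        }
    }
}
```

A provider-selection strategy like the one suggested in the second option could use a check of this kind to decide when to fall back to a Commons Compress based implementation.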
[jira] [Closed] (MATH-1471) BicubicInterpolatingFunction not interpolating correctly for non discrete y value
[ https://issues.apache.org/jira/browse/MATH-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom closed MATH-1471.
---------------------
Resolution: Invalid

> BicubicInterpolatingFunction not interpolating correctly for non discrete y value
> ---------------------------------------------------------------------------------
>
> Key: MATH-1471
> URL: https://issues.apache.org/jira/browse/MATH-1471
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 3.6.1
> Environment: JDK 1.8.0_181
> Reporter: Tom
> Priority: Major
> Attachments: ApacheCommonsMathBiInterpolationTests.zip, Interpolate.java, InterpolateTest.java
>
>
> Upon performing a bicubic interpolation with two points (x0, y0) and (x1, y1), the returned bicubic interpolating function returns the same result for variations in the estimated y value.
> For example, my inputs are (20, 20) and (25, 25) with f(20, 20) = 64 and f(25, 25) = 6468.
> When I get the bicubic interpolating function for this and vary the estimated x, it works fine. For (21, 20), the function returns 730.016. When I input (20, 21), the function returns 64, which is f(20, 20). For any y value between 20 and 25, the result is 64. This is the case for any function for which the y estimate is different from the value on the points.
> In other instances, it is varying x values that give the same result, while varying y estimates seem to work as expected.
[jira] [Closed] (TEXT-96) Convenience methods needed for RandomStringGenerator
[ https://issues.apache.org/jira/browse/TEXT-96?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher closed TEXT-96.
---------------------------------
Resolution: Fixed
Fix Version/s: (was: 1.x)

I'm closing this issue as commons-lang RandomStringUtils was "undeprecated" and some work has been done to make RandomStringGenerator easier to use.

> Convenience methods needed for RandomStringGenerator
> ----------------------------------------------------
>
> Key: TEXT-96
> URL: https://issues.apache.org/jira/browse/TEXT-96
> Project: Commons Text
> Issue Type: Improvement
> Affects Versions: 1.1
> Reporter: Peter Phillips
> Priority: Minor
>
> {{RandomStringGenerator}} is extremely verbose compared to the deprecated commons.lang3 {{RandomStringUtils}}.
> Previously we could write:
> {code:java}
> RandomStringUtils.randomNumeric(10)
> {code}
> to generate a numeric string, whereas this has now become:
> {code:java}
> new RandomStringGenerator.Builder().withinRange('0', '9').build().generate(10)
> {code}
> although in practice we would then also use static imports.
> The {{randomAlphabetic}} conversion is even more verbose:
> {code:java}
> new RandomStringGenerator.Builder().withinRange('A', 'z').filteredBy(new CharacterPredicate() {
>     @Override
>     public boolean test(int codePoint) {
>         // keep only A-Z and a-z out of the range 'A'..'z'
>         return codePoint >= 'a' || codePoint <= 'Z';
>     }
> }).build().generate(10)
> {code}
> and at that point I lost enthusiasm for trying to replicate {{randomAlphanumeric}}.
> I don't think the average Java developer would understand what a code point is in the first place, so trying to get our automation testers to use the new API to implement random alphanumeric character generation would be difficult.
> I therefore suggest that commons-text should have a copy of {{RandomStringUtils}}, which can even delegate to {{RandomStringGenerator}}, or alternatively convenience static methods for the common use cases.
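As an illustration of the kind of convenience static methods the reporter asks for, here is a minimal JDK-only sketch. The class name {{RandomStrings}} and the explicit {{Random}} parameter are assumptions made for this example; this is not the commons-text or commons-lang API, it only reuses the method names mentioned in the issue, and it is not meant for security-sensitive use.

```java
import java.util.Random;

/** Hypothetical convenience methods; an illustration, not the commons-text API. */
final class RandomStrings {
    private static final String DIGITS = "0123456789";
    private static final String LETTERS =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    static String randomNumeric(int length, Random rng) {
        return pick(DIGITS, length, rng);
    }

    static String randomAlphabetic(int length, Random rng) {
        return pick(LETTERS, length, rng);
    }

    static String randomAlphanumeric(int length, Random rng) {
        return pick(LETTERS + DIGITS, length, rng);
    }

    /** Draws {@code length} characters uniformly from the given alphabet. */
    private static String pick(String alphabet, int length, Random rng) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(alphabet.charAt(rng.nextInt(alphabet.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Random rng = new Random();
        System.out.println(randomNumeric(10, rng));
        System.out.println(randomAlphanumeric(10, rng));
    }
}
```

Listing the allowed characters explicitly sidesteps the code point range filtering that the issue describes as hard to explain to testers.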
[jira] [Closed] (TEXT-134) Add a Properties file string lookup
[ https://issues.apache.org/jira/browse/TEXT-134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher closed TEXT-134.
----------------------------------
Resolution: Fixed
Fix Version/s: (was: 1.6)
               1.5

I'm closing this issue as it was released as part of 1.5. Changes should be discussed in a new issue imho.

> Add a Properties file string lookup
> -----------------------------------
>
> Key: TEXT-134
> URL: https://issues.apache.org/jira/browse/TEXT-134
> Project: Commons Text
> Issue Type: New Feature
> Reporter: Gary Gregory
> Assignee: Gary Gregory
> Priority: Major
> Fix For: 1.5
>
>
> Add a Properties file String Lookup to look up the value for a given key in the format "Document::Key".
> For example: "com/domain/document.properties::key".
> Note the use of the separator "::" instead of ":" to allow for paths containing ":" like "C:\path\to\file.properties".
> The lookup key is "properties".
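The reason for the "::" separator can be seen in a small JDK-only sketch that splits such a key into document path and property key. The class name and the choice to split at the first occurrence of "::" are assumptions for this example, not a statement about how commons-text implements the lookup.

```java
/** Hypothetical parser for keys of the form "document::key"; illustration only. */
final class PropertiesLookupKey {
    static final String SEPARATOR = "::";

    /** Returns a two-element array: {documentPath, propertyKey}. */
    static String[] split(String lookupKey) {
        // Assumption: the first "::" separates the document from the key.
        int idx = lookupKey.indexOf(SEPARATOR);
        if (idx <= 0 || idx + SEPARATOR.length() >= lookupKey.length()) {
            throw new IllegalArgumentException(
                    "Expected format 'document::key' but got: " + lookupKey);
        }
        return new String[] {
            lookupKey.substring(0, idx),
            lookupKey.substring(idx + SEPARATOR.length())
        };
    }

    public static void main(String[] args) {
        // A single ':' inside a Windows path does not confuse the split.
        String[] parts = split("C:\\path\\to\\file.properties::key");
        System.out.println(parts[0]);
        System.out.println(parts[1]);
    }
}
```

With a single ":" as separator, the drive letter in "C:\path\to\file.properties" would be mistaken for the document name.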
[jira] [Closed] (TEXT-137) Add a URL string lookup
[ https://issues.apache.org/jira/browse/TEXT-137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher closed TEXT-137.
----------------------------------
Resolution: Fixed

I'm closing this issue as it was released as part of 1.5. Removing/changing it should be part of a new issue imho.

> Add a URL string lookup
> -----------------------
>
> Key: TEXT-137
> URL: https://issues.apache.org/jira/browse/TEXT-137
> Project: Commons Text
> Issue Type: New Feature
> Reporter: Gary Gregory
> Assignee: Gary Gregory
> Priority: Major
> Fix For: 1.5
>
>
> Add a simple URL string lookup to read the contents of a URL. This is _not_ a full-fledged HTTP client.
> For example, using the HTTP scheme:
> "UTF-8:http://www.google.com"
> For example, using the file scheme:
> "UTF-8:file:///C:/somehome/commons/commons-text/src/test/resources/document.properties"
> The URL lookup key is "url".
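The key format in the examples above is a charset name, a ":" and then the URL. A minimal JDK-only sketch of splitting such a key follows; the class and method names are hypothetical, and splitting at the first ":" is an assumption for this example rather than a description of the commons-text implementation.

```java
import java.nio.charset.Charset;

/** Hypothetical parser for keys of the form "charset:url"; illustration only. */
final class UrlLookupKey {
    /** Index of the first ':', which is assumed to end the charset name. */
    private static int separatorIndex(String key) {
        int idx = key.indexOf(':');
        if (idx <= 0 || idx == key.length() - 1) {
            throw new IllegalArgumentException(
                    "Expected format 'charset:url' but got: " + key);
        }
        return idx;
    }

    static Charset charsetOf(String key) {
        return Charset.forName(key.substring(0, separatorIndex(key)));
    }

    /** Everything after the first ':' is the URL, including its own colons. */
    static String urlOf(String key) {
        return key.substring(separatorIndex(key) + 1);
    }

    public static void main(String[] args) {
        String key = "UTF-8:http://www.google.com";
        System.out.println(charsetOf(key));
        System.out.println(urlOf(key));
    }
}
```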
[jira] [Closed] (TEXT-135) Add a script string lookup
[ https://issues.apache.org/jira/browse/TEXT-135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher closed TEXT-135.
----------------------------------
Resolution: Fixed
Fix Version/s: 1.5

> Add a script string lookup
> --------------------------
>
> Key: TEXT-135
> URL: https://issues.apache.org/jira/browse/TEXT-135
> Project: Commons Text
> Issue Type: New Feature
> Reporter: Gary Gregory
> Assignee: Gary Gregory
> Priority: Major
> Fix For: 1.5
>
>
> Add a script file string lookup.
> Example key: "javascript:\"Hello World!\""
> The lookup key is "script".
[jira] [Updated] (TEXT-137) Add a URL string lookup
[ https://issues.apache.org/jira/browse/TEXT-137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher updated TEXT-137:
-----------------------------------
Fix Version/s: 1.5
[jira] [Updated] (TEXT-135) Add a script string lookup
[ https://issues.apache.org/jira/browse/TEXT-135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher updated TEXT-135:
-----------------------------------
Fix Version/s: (was: 1.6)
[jira] [Closed] (TEXT-131) JaroWinklerDistance: Calculation deviates from definition
[ https://issues.apache.org/jira/browse/TEXT-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher closed TEXT-131.
----------------------------------
Resolution: Fixed
Fix Version/s: 1.5

> JaroWinklerDistance: Calculation deviates from definition
> ---------------------------------------------------------
>
> Key: TEXT-131
> URL: https://issues.apache.org/jira/browse/TEXT-131
> Project: Commons Text
> Issue Type: Bug
> Affects Versions: 1.4
> Reporter: Jan Martin Keil
> Assignee: Rob Tompkins
> Priority: Major
> Fix For: 1.5
>
>
> The calculation in {{JaroWinklerDistance}} deviates from the definition of the Jaro-Winkler Similarity. By definition, the common prefix length is only determined for the first 4 characters. Further, the Jaro-Winkler Similarity is defined as {{JaroSimilarity + ScalingFactor * CommonPrefixLength * (1 - JaroSimilarity)}}.
> Therefore, I recommend the following changes:
> # Update Jaro-Winkler Similarity calculation
> {code:java}
> final double jw = j < 0.7D ? j : j + Math.min(defaultScalingFactor, 1D / mtp[3]) * mtp[2] * (1D - j);
> {code}
> to
> {code:java}
> final double jw = j < 0.7D ? j : j + defaultScalingFactor * mtp[2] * (1D - j);
> {code}
> # Update calculation of Common Prefix Length
> {code:java}
> for (int mi = 0; mi < min.length(); mi++) {
> {code}
> to
> {code:java}
> for (int mi = 0; mi < Math.min(4, min.length()); mi++) {
> {code}
> # Remove unnecessary return value
> {code:java}
> return new int[] {matches, transpositions, prefix, max.length()};
> {code}
> to
> {code:java}
> return new int[] {matches, transpositions, prefix};
> {code}
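The definition quoted in the issue can be written down as a short standalone sketch. The class name is hypothetical, the scaling factor of 0.1 is an assumed default, and the {{j < 0.7D}} threshold from the issue's snippets is deliberately left out so that only the stated formula and the 4-character prefix cap are shown; this is not the commons-text source.

```java
/** Hypothetical standalone sketch of the formula quoted in the issue. */
final class JaroWinklerFormula {
    static final double SCALING_FACTOR = 0.1; // assumed default scaling factor
    static final int MAX_PREFIX = 4;          // prefix is capped at 4 characters

    /** Length of the common prefix, counted over at most the first 4 characters. */
    static int commonPrefixLength(CharSequence a, CharSequence b) {
        int limit = Math.min(MAX_PREFIX, Math.min(a.length(), b.length()));
        int n = 0;
        while (n < limit && a.charAt(n) == b.charAt(n)) {
            n++;
        }
        return n;
    }

    /** JaroSimilarity + ScalingFactor * CommonPrefixLength * (1 - JaroSimilarity). */
    static double jaroWinkler(double jaro, int prefixLength) {
        return jaro + SCALING_FACTOR * prefixLength * (1.0 - jaro);
    }

    public static void main(String[] args) {
        int prefix = commonPrefixLength("aaaaaaX", "aaaaaaY");
        System.out.println(prefix);
        System.out.println(jaroWinkler(0.9, prefix));
    }
}
```

Because the prefix never exceeds 4 and the scaling factor is 0.1, the boost term stays at or below 0.4 * (1 - JaroSimilarity), so the result cannot exceed 1.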
[jira] [Updated] (TEXT-131) JaroWinklerDistance: Calculation deviates from definition
[ https://issues.apache.org/jira/browse/TEXT-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher updated TEXT-131:
-----------------------------------
Affects Version/s: 1.4
[jira] [Updated] (TEXT-130) JaroWinklerDistance: Wrong results due to precision of transpositions
[ https://issues.apache.org/jira/browse/TEXT-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher updated TEXT-130:
-----------------------------------
Affects Version/s: 1.4

> JaroWinklerDistance: Wrong results due to precision of transpositions
> ---------------------------------------------------------------------
>
> Key: TEXT-130
> URL: https://issues.apache.org/jira/browse/TEXT-130
> Project: Commons Text
> Issue Type: Bug
> Affects Versions: 1.4
> Reporter: Jan Martin Keil
> Assignee: Rob Tompkins
> Priority: Major
> Fix For: 1.5
>
>
> The method {{JaroWinklerDistance#matches}} returns {{transpositions / 2}} as an integer. However, {{transpositions}} is not guaranteed to be even. E.g. comparing "aaabcd" and "aaacdb" will result in {{transpositions}} = 3. Therefore the method must return 1.5, not 1. Otherwise the similarity is 0.9611 instead of 0.9417.
> I recommend returning {{halfTranspositions}} instead of {{transpositions}} and doing the cast and division ({{(double) mtp[1] / 2}}) in {{JaroWinklerDistance#apply}}.
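The two figures in the issue can be reproduced with a short arithmetic sketch. For "aaabcd" and "aaacdb", all 6 characters match, 3 matched positions differ (what the issue calls {{transpositions}} = 3) and the common prefix is "aaa". The class and method names are hypothetical, and the scaling factor of 0.1 is an assumption; this is not the commons-text source.

```java
import java.util.Locale;

/** Hypothetical arithmetic sketch: integer vs. floating-point halving of transpositions. */
final class TranspositionPrecision {
    /** Jaro similarity from string lengths, match count and transposition count t. */
    static double jaro(int len1, int len2, int matches, double t) {
        if (matches == 0) {
            return 0.0;
        }
        return (matches / (double) len1
                + matches / (double) len2
                + (matches - t) / matches) / 3.0;
    }

    static double jaroWinkler(double jaro, int prefixLength, double scalingFactor) {
        return jaro + scalingFactor * prefixLength * (1.0 - jaro);
    }

    static String fmt(double value) {
        return String.format(Locale.ROOT, "%.4f", value);
    }

    public static void main(String[] args) {
        int matches = 6;     // every character of "aaabcd" matches one in "aaacdb"
        int mismatches = 3;  // matched positions that differ ("transpositions" in the issue)
        int prefix = 3;      // common prefix "aaa"
        double p = 0.1;      // assumed scaling factor

        // Integer division truncates 3 / 2 to 1.
        double truncated = jaroWinkler(jaro(6, 6, matches, mismatches / 2), prefix, p);
        // Floating-point division keeps 1.5.
        double exact = jaroWinkler(jaro(6, 6, matches, mismatches / 2.0), prefix, p);

        System.out.println(fmt(truncated)); // 0.9611
        System.out.println(fmt(exact));     // 0.9417
    }
}
```

With t = 1 the Jaro term is (1 + 1 + 5/6) / 3, and with t = 1.5 it is (1 + 1 + 4.5/6) / 3, which accounts for the difference between the two results.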
[jira] [Closed] (TEXT-130) JaroWinklerDistance: Wrong results due to precision of transpositions
[ https://issues.apache.org/jira/browse/TEXT-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher closed TEXT-130.
----------------------------------
Resolution: Fixed
Fix Version/s: 1.5