[jira] [Commented] (CSV-196) Store the information of raw data read by lexer

2018-08-08 Thread Serge P. Nekoval (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574019#comment-16574019
 ] 

Serge P. Nekoval commented on CSV-196:
--

FYI, I've submitted a patch (CSV-229) with a similar feature. I'm not sure how 
it compares.

> Store the information of raw data read by lexer
> ---
>
> Key: CSV-196
> URL: https://issues.apache.org/jira/browse/CSV-196
> Project: Commons CSV
>  Issue Type: Improvement
>  Components: Parser
>Affects Versions: 1.4
>Reporter: Matt Sun
>Priority: Major
>  Labels: patch
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> It would be good for the CSVParser class to store whether a field was 
> enclosed in quotes in the original source file.
> For example, for this data sample:
> A, B, C
> a1, "b1", c1
> CSVParser gives us the record a1, b1, c1. This is helpful because it has 
> parsed the double quotes, but we have also lost information about the 
> original data. We can't tell from the returned CSVRecord whether the original 
> field was enclosed in double quotes.
> In our use case, we are integrating Apache Hadoop APIs with Commons CSV. CSV 
> is one kind of input for Hadoop jobs, which should support splitting input 
> data. To split a CSV file accurately into pieces, we need to count the bytes 
> of data that CSVParser actually read. CSVParser neither records whether a 
> field was enclosed in quotes nor stores the raw data of the original source, 
> so downstream users of CSVParser cannot get this information.
> Suggested fix: extend the token/CSVRecord with a boolean field indicating 
> whether the column was enclosed in quotes. While the Lexer is running 
> getNextToken, set the flag if a field is encapsulated and successfully parsed.
> I found another issue with a similar request, but it was marked as 
> resolved: [CSV-91] 
> https://issues.apache.org/jira/browse/CSV-91?jql=project%20%3D%20CSV%20AND%20text%20~%20%22with%20quotes%22
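
The suggested fix could be sketched roughly as follows. This is only an illustration under stated assumptions: QuoteAwareField and QuoteAwareSplitter are hypothetical names that do not exist in Commons CSV, and the splitter handles a single line with comma delimiters and doubled-quote escapes only. It shows the idea of recording a per-field "was quoted" flag while the input is being tokenized.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical value holder: a parsed field plus whether it was quoted in the source. */
final class QuoteAwareField {
    final String value;
    final boolean quoted;

    QuoteAwareField(String value, boolean quoted) {
        this.value = value;
        this.quoted = quoted;
    }
}

/** Hypothetical single-line splitter; this is NOT the Commons CSV Lexer. */
final class QuoteAwareSplitter {

    static List<QuoteAwareField> parseLine(String line) {
        List<QuoteAwareField> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false; // currently between an opening and a closing quote
        boolean quoted = false;   // the current field had an opening quote
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote inside a quoted field
                    i++;
                } else if (inQuotes) {
                    inQuotes = false;    // closing quote
                } else {
                    inQuotes = true;     // opening quote: remember it, drop leading blanks
                    quoted = true;
                    current.setLength(0);
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(toField(current, quoted));
                current.setLength(0);
                quoted = false;
            } else if (inQuotes || !quoted) {
                current.append(c);       // text after a closing quote is ignored
            }
        }
        fields.add(toField(current, quoted));
        return fields;
    }

    // Unquoted fields are trimmed; quoted fields keep their content verbatim.
    private static QuoteAwareField toField(StringBuilder raw, boolean quoted) {
        String text = raw.toString();
        return new QuoteAwareField(quoted ? text : text.trim(), quoted);
    }
}
```

For the sample line a1, "b1", c1 this yields three fields, of which only the second has quoted set to true.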



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (VFS-360) Migrate to HttpComponent HttpClient from the old Commons HttpClient

2018-08-08 Thread Woonsan Ko (JIRA)


[ 
https://issues.apache.org/jira/browse/VFS-360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573266#comment-16573266
 ] 

Woonsan Ko commented on VFS-360:


While contributing to JCR-4354 (and JCR-3975), I bumped into this. :-)
As the Jackrabbit WebDAV module has already been upgraded to HttpComponents 
4.x, the webdav: protocol doesn't work with the latest Jackrabbit module 
either, which I have to use for the new Jackrabbit features.
Anyway, I'd like to help with this by working on it and providing PRs. Stay tuned!

Thanks!

Woonsan

P.S. If this issue gets resolved, the next step will be to upgrade the 
Jackrabbit WebDAV dependency for webdav: protocol accordingly.

> Migrate to HttpComponent HttpClient from the old Commons HttpClient
> ---
>
> Key: VFS-360
> URL: https://issues.apache.org/jira/browse/VFS-360
> Project: Commons VFS
>  Issue Type: Improvement
>Reporter: Gary Gregory
>Priority: Major
>
> Migrate to HttpComponent HttpClient from the old Commons HttpClient.





[jira] [Created] (POOL-348) The commons-pool-evictor-thread should run as a Deamon

2018-08-08 Thread Thomas Neerup (JIRA)
Thomas Neerup created POOL-348:
--

 Summary: The commons-pool-evictor-thread should run as a Deamon
 Key: POOL-348
 URL: https://issues.apache.org/jira/browse/POOL-348
 Project: Commons Pool
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Thomas Neerup


The thread "commons-pool-evictor-thread" does not run as a daemon and keeps the 
JVM alive when all other non-daemon threads have ended.

Is there any reason for this behaviour? Otherwise the thread should be started 
as a daemon.
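
As an illustration of the requested behaviour, here is a minimal sketch. It assumes the evictor is driven by a scheduled executor; the class DaemonEvictorThreadFactory and the method newEvictorExecutor are hypothetical and are not taken from the Commons Pool sources. The only point is that a thread marked with setDaemon(true) does not keep the JVM alive.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;

/** Hypothetical factory; the thread name mirrors the one mentioned in the report. */
final class DaemonEvictorThreadFactory implements ThreadFactory {

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = new Thread(task, "commons-pool-evictor-thread");
        // A daemon thread does not prevent the JVM from exiting
        // once all non-daemon threads have finished.
        thread.setDaemon(true);
        return thread;
    }

    /** Assumed wiring: a single-threaded scheduler that uses the daemon factory. */
    static ScheduledExecutorService newEvictorExecutor() {
        return Executors.newSingleThreadScheduledExecutor(new DaemonEvictorThreadFactory());
    }
}
```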

 





[jira] [Updated] (TEXT-131) JaroWinklerDistance: Calculation deviates from definition

2018-08-08 Thread Rob Tompkins (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEXT-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rob Tompkins updated TEXT-131:
--
Assignee: Rob Tompkins

> JaroWinklerDistance: Calculation deviates from definition
> -
>
> Key: TEXT-131
> URL: https://issues.apache.org/jira/browse/TEXT-131
> Project: Commons Text
>  Issue Type: Bug
>Reporter: Jan Martin Keil
>Assignee: Rob Tompkins
>Priority: Major
>
> The calculation in {{JaroWinklerDistance}} deviates from the definition of 
> the Jaro-Winkler Similarity. By definition, the common prefix length is only 
> determined for the first 4 characters. Further, the Jaro-Winkler similarity 
> is defined as 
> {{JaroSimilarity + ScalingFactor * CommonPrefixLength * (1 - JaroSimilarity)}}.
>  Therefore, I recommend the following changes:
>  # Update Jaro-Winkler Similarity calculation
> {code:java}
> final double jw = j < 0.7D ? j : j + Math.min(defaultScalingFactor, 1D / 
> mtp[3]) * mtp[2] * (1D - j);
> {code}
> to
> {code:java}
> final double jw = j < 0.7D ? j : j + defaultScalingFactor * mtp[2] * (1D - j);
> {code}
>  # Update calculation of Common Prefix Length
> {code:java}
> for (int mi = 0; mi < min.length(); mi++) {
> {code}
> to
> {code:java}
> for (int mi = 0; mi < Math.min(4, min.length()); mi++) {
> {code}
>  # Remove unnecessary return value
> {code:java}
> return new int[] {matches, transpositions, prefix, max.length()};
> {code}
> to
> {code:java}
> return new int[] {matches, transpositions, prefix};
> {code}
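
The definition quoted in the issue can be written out as a small sketch. The class name JaroWinklerFormula and its constants are hypothetical; the scaling factor 0.1 is the conventional value and is assumed here, and the 0.7 threshold is taken from the code snippets in the issue rather than from the definition itself.

```java
/** Hypothetical helper illustrating the formula from the issue; not the Commons Text class. */
final class JaroWinklerFormula {

    static final double SCALING_FACTOR = 0.1; // assumed conventional value
    static final int MAX_PREFIX = 4;          // prefix is only counted for the first 4 chars
    static final double THRESHOLD = 0.7;      // threshold used in the issue's snippets

    /** Length of the common prefix, capped at MAX_PREFIX. */
    static int commonPrefixLength(CharSequence a, CharSequence b) {
        int limit = Math.min(MAX_PREFIX, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        return prefix;
    }

    /** JaroSimilarity + ScalingFactor * CommonPrefixLength * (1 - JaroSimilarity). */
    static double jaroWinkler(double jaro, int prefixLength) {
        if (jaro < THRESHOLD) {
            return jaro; // no prefix bonus below the threshold
        }
        return jaro + SCALING_FACTOR * prefixLength * (1.0 - jaro);
    }
}
```

With the cap at 4 and a scaling factor of at most 0.25, the result cannot exceed 1, which is presumably why the Math.min guard becomes unnecessary.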





[jira] [Updated] (TEXT-130) JaroWinklerDistance: Wrong results due to precision of transpositions

2018-08-08 Thread Rob Tompkins (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEXT-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rob Tompkins updated TEXT-130:
--
Assignee: Rob Tompkins

> JaroWinklerDistance: Wrong results due to precision of transpositions
> -
>
> Key: TEXT-130
> URL: https://issues.apache.org/jira/browse/TEXT-130
> Project: Commons Text
>  Issue Type: Bug
>Reporter: Jan Martin Keil
>Assignee: Rob Tompkins
>Priority: Major
>
> The method {{JaroWinklerDistance#matches}} returns {{transpositions / 2}} as 
> an integer. However, {{transpositions}} is not guaranteed to be even. E.g. 
> comparing "aaabcd" and "aaacdb" results in {{transpositions}} = 3. 
> Therefore the method must return 1.5, not 1. Otherwise the similarity is 
> 0.9611 instead of 0.9417.
> I recommend returning {{halfTranspositions}} instead of {{transpositions}} 
> and doing the cast and division ({{(double) mtp[1] / 2}}) in 
> {{JaroWinklerDistance#apply}}.
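
The numbers in the report can be reproduced with a self-contained sketch that keeps the halved transposition count as a double. JaroWinklerSketch is a hypothetical class written for this illustration, not the Commons Text implementation; it assumes a scaling factor of 0.1, a prefix cap of 4, and applies no boost threshold.

```java
/** Hypothetical, self-contained Jaro-Winkler sketch; not the Commons Text code. */
final class JaroWinklerSketch {

    static double similarity(String first, String second) {
        int len1 = first.length();
        int len2 = second.length();
        if (len1 == 0 && len2 == 0) {
            return 1.0;
        }
        // Characters match only if they are at most 'window' positions apart.
        int window = Math.max(0, Math.max(len1, len2) / 2 - 1);
        boolean[] matched1 = new boolean[len1];
        boolean[] matched2 = new boolean[len2];
        int matches = 0;
        for (int i = 0; i < len1; i++) {
            int from = Math.max(0, i - window);
            int to = Math.min(len2 - 1, i + window);
            for (int k = from; k <= to; k++) {
                if (!matched2[k] && first.charAt(i) == second.charAt(k)) {
                    matched1[i] = true;
                    matched2[k] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }
        // Count matched characters that appear in a different order.
        int outOfOrder = 0;
        int j = 0;
        for (int i = 0; i < len1; i++) {
            if (!matched1[i]) {
                continue;
            }
            while (!matched2[j]) {
                j++;
            }
            if (first.charAt(i) != second.charAt(j)) {
                outOfOrder++;
            }
            j++;
        }
        double halfTranspositions = outOfOrder / 2.0; // kept as a double, so 3 becomes 1.5
        double m = matches;
        double jaro = (m / len1 + m / len2 + (m - halfTranspositions) / m) / 3.0;

        int limit = Math.min(4, Math.min(len1, len2));
        int prefix = 0;
        while (prefix < limit && first.charAt(prefix) == second.charAt(prefix)) {
            prefix++;
        }
        return jaro + 0.1 * prefix * (1.0 - jaro);
    }
}
```

For "aaabcd" and "aaacdb" this gives 6 matches, 3 out-of-order characters and a common prefix of 3, so the Jaro value is 2.75 / 3 and the final similarity is about 0.9417. Truncating 1.5 to 1 would give about 0.9611 instead.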





[jira] [Commented] (TEXT-130) JaroWinklerDistance: Wrong results due to precision of transpositions

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/TEXT-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573487#comment-16573487
 ] 

ASF GitHub Bot commented on TEXT-130:
-

Github user asfgit closed the pull request at:

https://github.com/apache/commons-text/pull/87


> JaroWinklerDistance: Wrong results due to precision of transpositions
> -
>
> Key: TEXT-130
> URL: https://issues.apache.org/jira/browse/TEXT-130
> Project: Commons Text
>  Issue Type: Bug
>Reporter: Jan Martin Keil
>Assignee: Rob Tompkins
>Priority: Major
>
> The method {{JaroWinklerDistance#matches}} returns {{transpositions / 2}} as 
> an integer. However, {{transpositions}} is not guaranteed to be even. E.g. 
> comparing "aaabcd" and "aaacdb" results in {{transpositions}} = 3. 
> Therefore the method must return 1.5, not 1. Otherwise the similarity is 
> 0.9611 instead of 0.9417.
> I recommend returning {{halfTranspositions}} instead of {{transpositions}} 
> and doing the cast and division ({{(double) mtp[1] / 2}}) in 
> {{JaroWinklerDistance#apply}}.





[jira] [Created] (LANG-1408) NumberUtils: Rounding utilities for BigDecimal to primitive double avoiding NPEs.

2018-08-08 Thread Rob Tompkins (JIRA)
Rob Tompkins created LANG-1408:
--

 Summary: NumberUtils: Rounding utilities for BigDecimal to 
primitive double avoiding NPEs.
 Key: LANG-1408
 URL: https://issues.apache.org/jira/browse/LANG-1408
 Project: Commons Lang
  Issue Type: Improvement
Reporter: Rob Tompkins
Assignee: Rob Tompkins
 Fix For: 3.8


For the sake of formatting rounded {{BigDecimal}} values in JSON/XML I'm 
looking for the following methods:

1. 
{code:java}
public static double toDouble(BigDecimal value);
{code}
that defaults to {{0}} when {{value}} is null.

2.
{code:java}
public static double toDouble(BigDecimal value, double defaultValue);
{code}
that does essentially the same as 1. but allows a default value to be 
specified.

3. 
{code:java}
public static BigDecimal toScaledBigDecimal(BigDecimal value, Integer scale, 
RoundingMode roundingMode);
{code}
that converts a {{BigDecimal}} to a {{BigDecimal}} whose scale is the specified 
value with input rounding mode applied.

4. 
{code:java}
public static BigDecimal toScaledBigDecimal(BigDecimal value);
{code}
that converts a {{BigDecimal}} to a {{BigDecimal}} whose scale is 2 with 
{{RoundingMode.HALF_UP}} rounding mode applied.

5.
{code:java}
public static BigDecimal toScaledBigDecimal(Float value, Integer scale, 
RoundingMode roundingMode);
{code}
that converts a {{Float}} to a {{BigDecimal}} whose scale is the specified 
value with input rounding mode applied.

6.
{code:java}
public static BigDecimal toScaledBigDecimal(Double value, Integer scale, 
RoundingMode roundingMode);
{code}
that converts a {{Double}} to a {{BigDecimal}}  whose scale is the specified 
value with input rounding mode applied.

7.
{code:java}
public static BigDecimal toScaledBigDecimal(Double value);
{code}
that converts a {{Double}} to a {{BigDecimal}} whose scale is 2 with 
{{RoundingMode.HALF_UP}} rounding mode applied.

8.
{code:java}
public static BigDecimal toScaledBigDecimal(String value, Integer scale, 
RoundingMode roundingMode);
{code}
that converts a {{String}} to a {{BigDecimal}} whose scale is the specified 
value with input rounding mode applied.

9.
{code:java}
public static BigDecimal toScaledBigDecimal(String value);
{code}
that converts a {{String}} to a {{BigDecimal}} whose scale is 2 with 
{{RoundingMode.HALF_UP}} rounding mode applied
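
A few of the requested methods could be sketched as below. This is not the eventual NumberUtils code: the class name BigDecimalRounding is hypothetical, scale is taken as a primitive int for brevity, and returning BigDecimal.ZERO for a null input in the scaled variants is an assumption that the description above does not state.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical sketch of some of the requested helpers; not the NumberUtils implementation. */
final class BigDecimalRounding {

    /** Returns the double value, or 0 when value is null. */
    static double toDouble(BigDecimal value) {
        return toDouble(value, 0.0d);
    }

    /** Returns the double value, or defaultValue when value is null. */
    static double toDouble(BigDecimal value, double defaultValue) {
        return value == null ? defaultValue : value.doubleValue();
    }

    /** Rescales value; a null value maps to BigDecimal.ZERO (assumption of this sketch). */
    static BigDecimal toScaledBigDecimal(BigDecimal value, int scale, RoundingMode roundingMode) {
        if (value == null) {
            return BigDecimal.ZERO;
        }
        return value.setScale(scale, roundingMode);
    }

    /** Scale 2 with RoundingMode.HALF_UP, as described for the single-argument variants. */
    static BigDecimal toScaledBigDecimal(BigDecimal value) {
        return toScaledBigDecimal(value, 2, RoundingMode.HALF_UP);
    }

    /** String variant; a null String maps to BigDecimal.ZERO (assumption of this sketch). */
    static BigDecimal toScaledBigDecimal(String value) {
        return value == null ? BigDecimal.ZERO : toScaledBigDecimal(new BigDecimal(value));
    }
}
```

For example, toScaledBigDecimal("2.345") returns 2.35, and toDouble(null) returns 0.0 instead of throwing a NullPointerException.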





[jira] [Comment Edited] (LANG-1406) StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase

2018-08-08 Thread HiuFung Kwok (JIRA)


[ 
https://issues.apache.org/jira/browse/LANG-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573108#comment-16573108
 ] 

HiuFung Kwok edited comment on LANG-1406 at 8/8/18 12:04 PM:
-

Hi all,

After a bit of research, this seems to be a known issue when a String contains 
certain Unicode characters 
([ref|https://www.quora.com/Is-Javas-toLowercase-string-method-reliable-for-Unicode]): 
String.toLowerCase() can produce a result whose length differs from the 
original.

In this case "\u0130x" becomes a String of three chars, [ i,  ̇, x], instead of 
the two chars [ İ, x].

Given this result from toLowerCase(), StringUtils.replaceIgnoreCase ends up 
trying to access a position that does not exist in the original string (3 in 
this case, while str.length() is 2).

The fix I came up with is to replace toLowerCase() with toUpperCase(), in order 
to avoid this toLowerCase() behaviour while performing case-insensitive 
comparisons.

Fix: 
[https://github.com/HiuKwok/commons-lang/commit/e0f6c7802b5e721019a602bf30b31c79dbf6d233]

Testcase: 
https://github.com/HiuKwok/commons-lang/commit/590f90889bf61a5570bd98b78e73410a07d7410b
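
The length change described above, and one way to sidestep it, can be sketched as follows. This is not the code from the pull request: CaseInsensitiveReplace is a hypothetical class that compares regions of the original text with String.regionMatches(ignoreCase, ...) so that no lower-cased or upper-cased copy is ever indexed. Upper-casing can change lengths too for some characters (for example "ß" becomes "SS"), which is the reason the sketch avoids case conversion altogether.

```java
import java.util.Locale;

/** Hypothetical sketch; not the StringUtils implementation or the PR's change. */
final class CaseInsensitiveReplace {

    /** Length of the lower-cased text, to show that it can differ from text.length(). */
    static int lowerCasedLength(String text) {
        return text.toLowerCase(Locale.ROOT).length();
    }

    /** Replaces every case-insensitive occurrence of search, indexing only the original text. */
    static String replaceIgnoreCase(String text, String search, String replacement) {
        if (text == null || search == null || search.isEmpty() || replacement == null) {
            return text;
        }
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            // regionMatches compares char by char and returns false if the region runs past the end.
            if (text.regionMatches(true, i, search, 0, search.length())) {
                out.append(replacement);
                i += search.length();
            } else {
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }
}
```

With this sketch, lowerCasedLength("\u0130x") is 3 although the input has 2 chars, and replaceIgnoreCase("\u0130x", "x", "") returns "\u0130" without throwing.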

 

 



> StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase
> 
>
> Key: LANG-1406
> URL: https://issues.apache.org/jira/browse/LANG-1406
> Project: Commons Lang
>  Issue Type: Bug
>  Components: lang.*
>Reporter: Michael Ryan
>Priority: Major
>
> STEPS TO REPRODUCE:
> {code}
> StringUtils.replaceIgnoreCase("\u0130x", "x", "")
> {code}
> EXPECTED: "\u0130" is returned.
> ACTUAL: StringIndexOutOfBoundsException
> This happens because the replace method is assuming that text.length() == 
> text.toLowerCase().length(), which is not true for certain characters.





[jira] [Comment Edited] (LANG-1406) StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase

2018-08-08 Thread HiuFung Kwok (JIRA)


[ 
https://issues.apache.org/jira/browse/LANG-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573108#comment-16573108
 ] 

HiuFung Kwok edited comment on LANG-1406 at 8/8/18 12:04 PM:
-

Hi all,

After a bit of research, this seems to be a known issue when a String contains 
certain Unicode characters 
([ref|https://www.quora.com/Is-Javas-toLowercase-string-method-reliable-for-Unicode]): 
String.toLowerCase() can produce a result whose length differs from the 
original.

In this case "\u0130x" becomes a String of three chars, [ i,  ̇, x], instead of 
the two chars [ İ, x].

Given this result from toLowerCase(), StringUtils.replaceIgnoreCase ends up 
trying to access a position that does not exist in the original string (3 in 
this case, while str.length() is 2).

The fix I came up with is to replace toLowerCase() with toUpperCase(), in order 
to avoid this toLowerCase() behaviour while performing case-insensitive 
comparisons.

Fix: 
[https://github.com/HiuKwok/commons-lang/commit/e0f6c7802b5e721019a602bf30b31c79dbf6d233]

Testcase: 
https://github.com/HiuKwok/commons-lang/commit/590f90889bf61a5570bd98b78e73410a07d7410b

 

 



> StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase
> 
>
> Key: LANG-1406
> URL: https://issues.apache.org/jira/browse/LANG-1406
> Project: Commons Lang
>  Issue Type: Bug
>  Components: lang.*
>Reporter: Michael Ryan
>Priority: Major
>
> STEPS TO REPRODUCE:
> {code}
> StringUtils.replaceIgnoreCase("\u0130x", "x", "")
> {code}
> EXPECTED: "\u0130" is returned.
> ACTUAL: StringIndexOutOfBoundsException
> This happens because the replace method is assuming that text.length() == 
> text.toLowerCase().length(), which is not true for certain characters.





[jira] [Commented] (LANG-1406) StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/LANG-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573123#comment-16573123
 ] 

ASF GitHub Bot commented on LANG-1406:
--

GitHub user HiuKwok opened a pull request:

https://github.com/apache/commons-lang/pull/340

[LANG-1406]  StringIndexOutOfBoundsException in 
StringUtils.replaceIgnoreCase

Fix for LANG-1406 to avoid an exception when calling 
StringUtils.replaceIgnoreCase() on a String containing Unicode characters. 

Please let me know if anything else needs to be done for this PR, since this 
is my first contribution to the commons-lang project (add more test cases?).

All the best

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HiuKwok/commons-lang master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/commons-lang/pull/340.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #340


commit e0f6c7802b5e721019a602bf30b31c79dbf6d233
Author: Hiu Kwok 
Date:   2018-08-08T11:44:51Z

toUpperCase() > toLowerCase() to avoid unicode string length miscalculation

commit 590f90889bf61a5570bd98b78e73410a07d7410b
Author: Hiu Kwok 
Date:   2018-08-08T11:46:29Z

Assertion for example mentioned on LANG-1406 Description




> StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase
> 
>
> Key: LANG-1406
> URL: https://issues.apache.org/jira/browse/LANG-1406
> Project: Commons Lang
>  Issue Type: Bug
>  Components: lang.*
>Reporter: Michael Ryan
>Priority: Major
>
> STEPS TO REPRODUCE:
> {code}
> StringUtils.replaceIgnoreCase("\u0130x", "x", "")
> {code}
> EXPECTED: "\u0130" is returned.
> ACTUAL: StringIndexOutOfBoundsException
> This happens because the replace method is assuming that text.length() == 
> text.toLowerCase().length(), which is not true for certain characters.





[GitHub] commons-lang pull request #340: [LANG-1406] StringIndexOutOfBoundsException ...

2018-08-08 Thread HiuKwok
GitHub user HiuKwok opened a pull request:

https://github.com/apache/commons-lang/pull/340

[LANG-1406]  StringIndexOutOfBoundsException in 
StringUtils.replaceIgnoreCase

Fix for LANG-1406 to avoid an exception when calling 
StringUtils.replaceIgnoreCase() on a String containing Unicode characters. 

Please let me know if anything else needs to be done for this PR, since this 
is my first contribution to the commons-lang project (add more test cases?).

All the best

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HiuKwok/commons-lang master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/commons-lang/pull/340.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #340


commit e0f6c7802b5e721019a602bf30b31c79dbf6d233
Author: Hiu Kwok 
Date:   2018-08-08T11:44:51Z

toUpperCase() > toLowerCase() to avoid unicode string length miscalculation

commit 590f90889bf61a5570bd98b78e73410a07d7410b
Author: Hiu Kwok 
Date:   2018-08-08T11:46:29Z

Assertion for example mentioned on LANG-1406 Description




---


[jira] [Commented] (LANG-1406) StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/LANG-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573137#comment-16573137
 ] 

ASF GitHub Bot commented on LANG-1406:
--

Github user coveralls commented on the issue:

https://github.com/apache/commons-lang/pull/340
  

[![Coverage 
Status](https://coveralls.io/builds/18383289/badge)](https://coveralls.io/builds/18383289)

Coverage decreased (-0.04%) to 95.243% when pulling 
**590f90889bf61a5570bd98b78e73410a07d7410b on HiuKwok:master** into 
**a36c903d4f1065bc59f5e6d2bb0f9d92a5e71d83 on apache:master**.



> StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase
> 
>
> Key: LANG-1406
> URL: https://issues.apache.org/jira/browse/LANG-1406
> Project: Commons Lang
>  Issue Type: Bug
>  Components: lang.*
>Reporter: Michael Ryan
>Priority: Major
>
> STEPS TO REPRODUCE:
> {code}
> StringUtils.replaceIgnoreCase("\u0130x", "x", "")
> {code}
> EXPECTED: "\u0130" is returned.
> ACTUAL: StringIndexOutOfBoundsException
> This happens because the replace method is assuming that text.length() == 
> text.toLowerCase().length(), which is not true for certain characters.





[jira] [Comment Edited] (LANG-1406) StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase

2018-08-08 Thread HiuFung Kwok (JIRA)


[ 
https://issues.apache.org/jira/browse/LANG-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573108#comment-16573108
 ] 

HiuFung Kwok edited comment on LANG-1406 at 8/8/18 12:15 PM:
-

Hi all,

After a bit of research, this seems to be a known issue when a String contains 
certain Unicode characters 
([ref|https://www.quora.com/Is-Javas-toLowercase-string-method-reliable-for-Unicode]): 
String.toLowerCase() can produce a result whose length differs from the 
original.

In this case "\u0130x" becomes a String of three chars, [ i,  ̇, x], instead of 
the two chars [ İ, x].

Given this result from toLowerCase(), StringUtils.replaceIgnoreCase ends up 
trying to access a position that does not exist in the original string (3 in 
this case, while str.length() is 2).

The fix I came up with is to replace toLowerCase() with toUpperCase(), in order 
to avoid this toLowerCase() behaviour while performing case-insensitive 
comparisons.


 



> StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase
> 
>
> Key: LANG-1406
> URL: https://issues.apache.org/jira/browse/LANG-1406
> Project: Commons Lang
>  Issue Type: Bug
>  Components: lang.*
>Reporter: Michael Ryan
>Priority: Major
>
> STEPS TO REPRODUCE:
> {code}
> StringUtils.replaceIgnoreCase("\u0130x", "x", "")
> {code}
> EXPECTED: "\u0130" is returned.
> ACTUAL: StringIndexOutOfBoundsException
> This happens because the replace method is assuming that text.length() == 
> text.toLowerCase().length(), which is not true for certain characters.





[GitHub] commons-lang issue #340: [LANG-1406] StringIndexOutOfBoundsException in Stri...

2018-08-08 Thread coveralls
Github user coveralls commented on the issue:

https://github.com/apache/commons-lang/pull/340
  

[![Coverage 
Status](https://coveralls.io/builds/18383289/badge)](https://coveralls.io/builds/18383289)

Coverage decreased (-0.04%) to 95.243% when pulling 
**590f90889bf61a5570bd98b78e73410a07d7410b on HiuKwok:master** into 
**a36c903d4f1065bc59f5e6d2bb0f9d92a5e71d83 on apache:master**.



---


[jira] [Commented] (LANG-1406) StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase

2018-08-08 Thread HiuFung Kwok (JIRA)


[ 
https://issues.apache.org/jira/browse/LANG-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573108#comment-16573108
 ] 

HiuFung Kwok commented on LANG-1406:


Hi all,

After a bit of research, this seems to be a known issue when a String contains 
certain Unicode characters 
([ref|https://www.quora.com/Is-Javas-toLowercase-string-method-reliable-for-Unicode]): 
String.toLowerCase() can produce a result whose length differs from the 
original.

In this case "\u0130x" becomes a String of three chars, [ i,  ̇, x], instead of 
the two chars [ İ, x].

Given this result from toLowerCase(), StringUtils.replaceIgnoreCase ends up 
trying to access a position that does not exist in the original string (3 in 
this case, while str.length() is 2).

The fix I came up with is to replace toLowerCase() with toUpperCase(), in order 
to avoid this toLowerCase() behaviour while performing case-insensitive 
comparisons.

Fix: 
[https://github.com/HiuKwok/commons-lang/commit/e0f6c7802b5e721019a602bf30b31c79dbf6d233]

Testcase: 
https://github.com/HiuKwok/commons-lang/commit/590f90889bf61a5570bd98b78e73410a07d7410b

 

 

> StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase
> 
>
> Key: LANG-1406
> URL: https://issues.apache.org/jira/browse/LANG-1406
> Project: Commons Lang
>  Issue Type: Bug
>  Components: lang.*
>Reporter: Michael Ryan
>Priority: Major
>
> STEPS TO REPRODUCE:
> {code}
> StringUtils.replaceIgnoreCase("\u0130x", "x", "")
> {code}
> EXPECTED: "\u0130" is returned.
> ACTUAL: StringIndexOutOfBoundsException
> This happens because the replace method is assuming that text.length() == 
> text.toLowerCase().length(), which is not true for certain characters.





[jira] [Resolved] (RNG-50) PoissonSampler single use speed improvements

2018-08-08 Thread Gilles (JIRA)


 [ 
https://issues.apache.org/jira/browse/RNG-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gilles resolved RNG-50.
---
   Resolution: Implemented
Fix Version/s: 1.1

Improvement implemented as of commit edb3eed76e5a50ddce94dd5510f0c9d2f54be35a 
("master"); discussion and further changes postponed to after the release of 
version 1.1.

> PoissonSampler single use speed improvements
> 
>
> Key: RNG-50
> URL: https://issues.apache.org/jira/browse/RNG-50
> Project: Commons RNG
>  Issue Type: Improvement
>Affects Versions: 1.0
>Reporter: Alex D Herbert
>Priority: Minor
> Fix For: 1.1
>
> Attachments: PoissonSamplerTest.java, jmh-result.csv
>
>
> The Sampler architecture of {{org.apache.commons.rng.sampling.distribution}} 
> is nicely written for fast sampling of small dataset sizes. The constructors 
> for the samplers do not check the input parameters are valid for the 
> respective distributions (in contrast to the old 
> {{org.apache.commons.math3.random.distribution}} classes). I assume this is a 
> design choice for speed. Thus most of the samplers can be used within a loop 
> to sample just one value with very little overhead.
> The {{PoissonSampler}} precomputes log factorial numbers upon construction if 
> the mean is above 40. This is done using the {{InternalUtils.FactorialLog}} 
> class. As of version 1.0 this internal class is currently only used in the 
> {{PoissonSampler}}.
> The cache size is limited to 2*PIVOT (where PIVOT=40), but the cache is 
> created and precomputed every time a PoissonSampler is constructed with a 
> mean above the PIVOT value.
> Why not create this once in a static block for the PoissonSampler?
> {code:java}
> /** {@code log(n!)}. */
> private static final FactorialLog factorialLog;
>
> static {
>     factorialLog = FactorialLog.create().withCache((int) (2 * PoissonSampler.PIVOT));
> }
> {code}
> This will make the construction cost of a new {{PoissonSampler}} negligible. 
> If the table is computed dynamically as a static construction method then the 
> overhead will be in the first use. Thus the following call will be much 
> faster:
> {code:java}
> UniformRandomProvider rng = ...;
> int value = new PoissonSampler(rng, 50).sample();
> {code}
> I have tested this modification (see attached file) and the results are:
> {noformat}
> Mean 40  Single construction ( 7330792) vs Loop construction                          (24334724)   (3.319522.2x faster)
> Mean 40  Single construction ( 7330792) vs Loop construction with static FactorialLog ( 7990656)   (1.090013.2x faster)
> Mean 50  Single construction ( 6390303) vs Loop construction                          (19389026)   (3.034132.2x faster)
> Mean 50  Single construction ( 6390303) vs Loop construction with static FactorialLog ( 6146556)   (0.961857.2x faster)
> Mean 60  Single construction ( 6041165) vs Loop construction                          (21337678)   (3.532047.2x faster)
> Mean 60  Single construction ( 6041165) vs Loop construction with static FactorialLog ( 5329129)   (0.882136.2x faster)
> Mean 70  Single construction ( 6064003) vs Loop construction                          (23963516)   (3.951765.2x faster)
> Mean 70  Single construction ( 6064003) vs Loop construction with static FactorialLog ( 5306081)   (0.875013.2x faster)
> Mean 80  Single construction ( 6064772) vs Loop construction                          (26381365)   (4.349935.2x faster)
> Mean 80  Single construction ( 6064772) vs Loop construction with static FactorialLog ( 6341274)   (1.045591.2x faster)
> {noformat}
> Thus the speed improvements would be approximately 3-4 fold for single use 
> Poisson sampling.
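
The idea of computing the table once per class rather than once per sampler instance can be illustrated with a standalone sketch. LogFactorialCache is a hypothetical class and does not reproduce InternalUtils.FactorialLog; the only value taken from the issue is PIVOT = 40 with a cache of 2 * PIVOT entries.

```java
/** Hypothetical sketch of a class-level log-factorial cache; not the Commons RNG code. */
final class LogFactorialCache {

    private static final int PIVOT = 40; // value quoted in the issue

    /** LOG_FACTORIALS[n] holds log(n!) for 0 <= n < 2 * PIVOT. */
    private static final double[] LOG_FACTORIALS = new double[2 * PIVOT];

    static {
        // Runs once when the class is initialized, not once per sampler instance.
        // log(0!) = log(1!) = 0, and log(n!) = log((n-1)!) + log(n).
        for (int n = 2; n < LOG_FACTORIALS.length; n++) {
            LOG_FACTORIALS[n] = LOG_FACTORIALS[n - 1] + Math.log(n);
        }
    }

    /** Returns log(n!), using the cache where possible and extending the sum otherwise. */
    static double logFactorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative: " + n);
        }
        if (n < LOG_FACTORIALS.length) {
            return LOG_FACTORIALS[n];
        }
        double result = LOG_FACTORIALS[LOG_FACTORIALS.length - 1];
        for (int k = LOG_FACTORIALS.length; k <= n; k++) {
            result += Math.log(k);
        }
        return result;
    }
}
```

The trade-off is the one described in the issue: the cost of filling the table moves from every constructor call to the first use of the class.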





[GitHub] commons-lang issue #340: [LANG-1406] StringIndexOutOfBoundsException in Stri...

2018-08-08 Thread kinow
Github user kinow commented on the issue:

https://github.com/apache/commons-lang/pull/340
  
I'm surprised by this bug. Had no idea something like this could happen. 
Will debug later and see if I can understand why that happens (might have to 
train my brain to default to always use uppercase instead of lowercase?). 
Thanks for the pull request, we will review the code and if everything looks OK 
a committer will merge it.


---


[jira] [Commented] (LANG-1406) StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase

2018-08-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/LANG-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573168#comment-16573168
 ] 

ASF GitHub Bot commented on LANG-1406:
--

Github user kinow commented on the issue:

https://github.com/apache/commons-lang/pull/340
  
I'm surprised by this bug. Had no idea something like this could happen. 
Will debug later and see if I can understand why that happens (might have to 
train my brain to default to always use uppercase instead of lowercase?). 
Thanks for the pull request, we will review the code and if everything looks OK 
a committer will merge it.


> StringIndexOutOfBoundsException in StringUtils.replaceIgnoreCase
> 
>
> Key: LANG-1406
> URL: https://issues.apache.org/jira/browse/LANG-1406
> Project: Commons Lang
>  Issue Type: Bug
>  Components: lang.*
>Reporter: Michael Ryan
>Priority: Major
>
> STEPS TO REPRODUCE:
> {code}
> StringUtils.replaceIgnoreCase("\u0130x", "x", "")
> {code}
> EXPECTED: "\u0130" is returned.
> ACTUAL: StringIndexOutOfBoundsException
> This happens because the replace method is assuming that text.length() == 
> text.toLowerCase().length(), which is not true for certain characters.





[jira] [Commented] (POOL-348) The commons-pool-evictor-thread should run as a Deamon

2018-08-08 Thread Gary Gregory (JIRA)


[ 
https://issues.apache.org/jira/browse/POOL-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573179#comment-16573179
 ] 

Gary Gregory commented on POOL-348:
---

Thank you for your report. Feel free to provide a patch. Please make sure all 
unit tests pass.

> The commons-pool-evictor-thread should run as a Deamon
> --
>
> Key: POOL-348
> URL: https://issues.apache.org/jira/browse/POOL-348
> Project: Commons Pool
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Thomas Neerup
>Priority: Major
>
> The thread "commons-pool-evictor-thread" does not run as a daemon and keeps 
> the JVM alive when all other non-daemon threads have ended.
> Is there any reason for this behaviour? Otherwise the thread should be 
> started as a daemon.
>  





[jira] [Comment Edited] (MATH-1459) Create a way to calculate the Jacobian Matrix using a Differentiator

2018-08-08 Thread adrian (JIRA)


[ 
https://issues.apache.org/jira/browse/MATH-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573591#comment-16573591
 ] 

adrian edited comment on MATH-1459 at 8/8/18 5:55 PM:
--

Hey [~erans], I also made a cleanup pull request (deleting an unused file and 
making a minor improvement to the code) here. Sorry it wasn't in my last pull 
request.

https://github.com/apache/commons-math/pull/86

Thanks!



> Create a way to calculate the Jacobian Matrix using a Differentiator
> 
>
> Key: MATH-1459
> URL: https://issues.apache.org/jira/browse/MATH-1459
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 4.X
>Reporter: adrian
>Priority: Minor
> Fix For: 4.X
>
>
> Create a way to automatically calculate a Jacobian Matrix using a 
> Differentiator.
> I have done this with a pull request, but would like feedback.
> edit:  https://github.com/apache/commons-math/pull/84





[jira] [Commented] (MATH-1459) Create a way to calculate the Jacobian Matrix using a Differentiator

2018-08-08 Thread adrian (JIRA)


[ 
https://issues.apache.org/jira/browse/MATH-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573591#comment-16573591
 ] 

adrian commented on MATH-1459:
--

Hey [~erans], I also made a cleanup pull request (deleting an unused file and 
making a minor improvement to the code) here:

https://github.com/apache/commons-math/pull/86

Thanks!

> Create a way to calculate the Jacobian Matrix using a Differentiator
> 
>
> Key: MATH-1459
> URL: https://issues.apache.org/jira/browse/MATH-1459
> Project: Commons Math
>  Issue Type: Improvement
>Affects Versions: 4.X
>Reporter: adrian
>Priority: Minor
> Fix For: 4.X
>
>
> Create a way to automatically calculate a Jacobian Matrix using a 
> Differentiator.
> I have done this with a pull request, but would like feedback.
> edit:  https://github.com/apache/commons-math/pull/84


