[jira] [Commented] (LUCENE-8533) Incorrect readVInt: negative number could be returned

2018-10-18 Thread Vladimir Dolzhenko (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654982#comment-16654982
 ] 

Vladimir Dolzhenko commented on LUCENE-8533:


Patch [^LUCENE-8533_fix_readVInt_javadoc.patch] to fix javadoc is attached.

> Incorrect readVInt: negative number could be returned
> -
>
> Key: LUCENE-8533
> URL: https://issues.apache.org/jira/browse/LUCENE-8533
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Vladimir Dolzhenko
>Assignee: Uwe Schindler
>Priority: Major
> Attachments: LUCENE-8533_fix_readVInt_javadoc.patch, readVInt.patch
>
>
> {{readVInt()}} must return only non-negative numbers and throw an exception 
> if the decoded value would be negative.
> However, for the byte sequence {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.
> Simplifying 
> [readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113]
>  up to, but not including, the last readByte:
> {code:java}
> int i = ((byte)-1) & 0x7F;
> i |= (((byte)-1) & 0x7F) << 7;
> i |= (((byte)-1) & 0x7F) << 14;
> i |= (((byte)-1) & 0x7F) << 21;
> {code}
> Here {{i = 268435455}}, which in binary is 
> {{00001111_11111111_11111111_11111111}}
> Keeping in mind that {{int}} is a signed type, only 3 more bits are available before 
> overflow happens; in other words, {{(Integer.MAX_VALUE - i) >> 28 == 7}} is 
> the maximum value that can be stored in the 5th byte without overflow.
> Instead of 
> {code:java}
> i |= (b & 0x0F) << 28;
> if ((b & 0xF0) == 0) return i;
> {code}
> has to be
> {code:java}
> i |= (b & 0x07) << 28;
> if ((b & 0xF8) == 0) return i;
> {code}
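
The behaviour described in the quoted issue can be reproduced with a small standalone sketch. This is not Lucene's DataInput: the class VIntMaskDemo, the method readVInt(byte[], boolean) and the strict flag are hypothetical names, and the decoder reads from a plain byte array. It only contrasts the two pairs of masks quoted above for the fifth byte.

```java
import java.io.IOException;

/** Standalone sketch, not Lucene's DataInput: contrasts the two 5th-byte masks from the issue. */
final class VIntMaskDemo {

  /**
   * Decodes a variable-length int from the start of {@code bytes}.
   * strict == false uses the masks 0x0F / 0xF0, strict == true uses 0x07 / 0xF8.
   */
  static int readVInt(byte[] bytes, boolean strict) throws IOException {
    int pos = 0;
    int i = 0;
    // The first four bytes each contribute 7 payload bits; the high bit marks continuation.
    for (int shift = 0; shift <= 21; shift += 7) {
      byte b = bytes[pos++];
      i |= (b & 0x7F) << shift;
      if (b >= 0) {
        return i; // high bit clear: this was the last byte
      }
    }
    byte b = bytes[pos];
    int valueMask = strict ? 0x07 : 0x0F;
    int invalidMask = strict ? 0xF8 : 0xF0;
    i |= (b & valueMask) << 28;
    if ((b & invalidMask) == 0) {
      return i;
    }
    throw new IOException("Invalid vInt detected (too many bits)");
  }

  public static void main(String[] args) throws IOException {
    byte[] input = {-1, -1, -1, -1, 15};
    // With the 4-bit mask the fifth byte reaches the sign bit.
    System.out.println(readVInt(input, false));
    try {
      readVInt(input, true);
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With the 4-bit mask the input decodes to -1; with the 3-bit mask the same input is rejected, while a fifth byte of 7 still decodes to Integer.MAX_VALUE.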



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8533) Incorrect readVInt: negative number could be returned

2018-10-18 Thread Vladimir Dolzhenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Dolzhenko updated LUCENE-8533:
---
Attachment: LUCENE-8533_fix_readVInt_javadoc.patch







[jira] [Commented] (LUCENE-8533) Incorrect readVInt: negative number could be returned

2018-10-18 Thread Vladimir Dolzhenko (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654909#comment-16654909
 ] 

Vladimir Dolzhenko commented on LUCENE-8533:


Thanks, Uwe, for the details and comments.

There is at least one place where {{writeVInt(-1)}} is used explicitly:

[DirectDocValuesConsumer:149|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/codecs/src/java/org/apache/lucene/codecs/memory/DirectDocValuesConsumer.java#L149]

It has to be addressed before dropping support for negative values in {{vInt}}.
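
To see why an explicit {{writeVInt(-1)}} matters here, the following sketch encodes a value with the usual scheme of seven payload bits per byte and an unsigned shift. It is an assumption that the writer works this way; VIntEncodeDemo and its writeVInt(int) are hypothetical names and not Lucene's DataOutput.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Standalone sketch of a 7-bits-per-byte vInt encoder; not Lucene's DataOutput. */
final class VIntEncodeDemo {

  static byte[] writeVInt(int i) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Emit 7 bits at a time, low bits first, setting the high bit while more bits remain.
    while ((i & ~0x7F) != 0) {
      out.write((i & 0x7F) | 0x80);
      i >>>= 7; // unsigned shift, so a negative input terminates after five bytes
    }
    out.write(i);
    return out.toByteArray();
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(writeVInt(-1)));
    System.out.println(Arrays.toString(writeVInt(Integer.MAX_VALUE)));
  }
}
```

Under this assumption, -1 is written as the bytes [-1, -1, -1, -1, 15], the same sequence discussed in the issue, so a reader that rejects a fifth byte above 7 could no longer read back what such a call wrote.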

> Incorrect readVInt: negative number could be returned
> -
>
> Key: LUCENE-8533
> URL: https://issues.apache.org/jira/browse/LUCENE-8533
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Vladimir Dolzhenko
>Assignee: Uwe Schindler
>Priority: Major
> Attachments: readVInt.patch






[jira] [Updated] (LUCENE-8533) Incorrect readVInt: negative number could be returned

2018-10-17 Thread Vladimir Dolzhenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Dolzhenko updated LUCENE-8533:
---
Description: 
{{readVInt()}} must return only non-negative numbers and throw an exception 
if the decoded value would be negative.

However, for the byte sequence {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.

simplifying 
[readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113]
 up to last readByte (exclusive):

{code:java}
int i = ((byte)-1) & 0x7F;
i |= (((byte)-1) & 0x7F) << 7;
i |= (((byte)-1) & 0x7F) << 14;
i |= (((byte)-1) & 0x7F) << 21;
{code}

Here {{i = 268435455}}, which in binary is 
{{00001111_11111111_11111111_11111111}}

Keeping in mind that {{int}} is a signed type, only 3 more bits are available before 
overflow happens; in other words, {{(Integer.MAX_VALUE - i) >> 28 == 7}} is 
the maximum value that can be stored in the 5th byte without overflow.

Instead of 

{code:java}
i |= (b & 0x0F) << 28;
if ((b & 0xF0) == 0) return i;
{code}

has to be

{code:java}
i |= (b & 0x07) << 28;
if ((b & 0xF8) == 0) return i;
{code}


  was:
{{readVInt()}} has to return positive numbers (and zero), throw some exception 
in case of negative numbers.

While for the sequence of bytes {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.

simplifying 
[readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113]
 up to last readByte (exclusive):

{code:java}
int i = ((byte)-1) & 0x7F;
i |= (((byte)-1) & 0x7F) << 7;
i |= (((byte)-1) & 0x7F) << 14;
i |= (((byte)-1) & 0x7F) << 21;
{code}

Here {{i = 268435455}} or in binary format is 
{{00001111_11111111_11111111_11111111}}

Keeping in mind that {{int}} is signed type we have only 3 more bits before 
overflow happens or in another words {{(Integer.MAX_VALUE - i) >> 28 == 7}} - 
that's max value could be stored in 5th byte to avoid overflow.

Instead of 

{code:java}
i |= (b & 0x0F) << 28;
if ((b & 0xF0) == 0) return i;
{code}

has to be

{code:java}
i |= (b & 0x07) << 28;
if ((b & 0xF8) == 0) return i;
{code}



> Incorrect readVInt: negative number could be returned
> -
>
> Key: LUCENE-8533
> URL: https://issues.apache.org/jira/browse/LUCENE-8533
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Vladimir Dolzhenko
>Priority: Major
> Attachments: readVInt.patch






[jira] [Updated] (LUCENE-8533) Incorrect readVInt: negative number could be returned

2018-10-17 Thread Vladimir Dolzhenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Dolzhenko updated LUCENE-8533:
---
Description: 
{{readVInt()}} must return only non-negative numbers and throw an exception 
if the decoded value would be negative.

However, for the byte sequence {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.

simplifying 
[readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113]
 up to last readByte (exclusive):

{code:java}
int i = ((byte)-1) & 0x7F;
i |= (((byte)-1) & 0x7F) << 7;
i |= (((byte)-1) & 0x7F) << 14;
i |= (((byte)-1) & 0x7F) << 21;
{code}

Here {{i = 268435455}}, which in binary is 
{{00001111_11111111_11111111_11111111}}

Keeping in mind that {{int}} is a signed type, only 3 more bits are available before 
overflow happens; in other words, {{(Integer.MAX_VALUE - i) >> 28 == 7}} is 
the maximum value that can be stored in the 5th byte without overflow.

Instead of 

{code:java}
i |= (b & 0x0F) << 28;
if ((b & 0xF0) == 0) return i;
{code}

has to be

{code:java}
i |= (b & 0x07) << 28;
if ((b & 0xF8) == 0) return i;
{code}


  was:
readVInt has to return positive numbers (and zero), throw some exception in 
case of negative numbers.

While for the sequence of bytes {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.

simplifying 
[readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113]
 up to last readByte (exclusive):

{code:java}
int i = ((byte)-1) & 0x7F;
i |= (((byte)-1) & 0x7F) << 7;
i |= (((byte)-1) & 0x7F) << 14;
i |= (((byte)-1) & 0x7F) << 21;
{code}

Here {{i = 268435455}} or in binary format is 
{{00001111_11111111_11111111_11111111}}

Keeping in mind that {{int}} is signed type we have only 3 more bits before 
overflow happens or in another words {{(Integer.MAX_VALUE - i) >> 28 == 7}} - 
that's max value could be stored in 5th byte to avoid overflow.

Instead of 

{code:java}
i |= (b & 0x0F) << 28;
if ((b & 0xF0) == 0) return i;
{code}

has to be

{code:java}
i |= (b & 0x07) << 28;
if ((b & 0xF8) == 0) return i;
{code}



> Incorrect readVInt: negative number could be returned
> -
>
> Key: LUCENE-8533
> URL: https://issues.apache.org/jira/browse/LUCENE-8533
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Vladimir Dolzhenko
>Priority: Major
> Attachments: readVInt.patch






[jira] [Updated] (LUCENE-8533) Incorrect readVInt: negative number could be returned

2018-10-16 Thread Vladimir Dolzhenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Dolzhenko updated LUCENE-8533:
---
Description: 
readVInt must return only non-negative numbers and throw an exception if the 
decoded value would be negative.

However, for the byte sequence {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.

simplifying 
[readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113]
 up to last readByte (exclusive):

{code:java}
int i = ((byte)-1) & 0x7F;
i |= (((byte)-1) & 0x7F) << 7;
i |= (((byte)-1) & 0x7F) << 14;
i |= (((byte)-1) & 0x7F) << 21;
{code}

Here {{i = 268435455}}, which in binary is 
{{00001111_11111111_11111111_11111111}}

Keeping in mind that {{int}} is a signed type, only 3 more bits are available before 
overflow happens; in other words, {{(Integer.MAX_VALUE - i) >> 28 == 7}}.

That is the maximum value that can be stored in the 5th byte without overflow.

Instead of 

{code:java}
i |= (b & 0x0F) << 28;
if ((b & 0xF0) == 0) return i;
{code}

has to be

{code:java}
i |= (b & 0x07) << 28;
if ((b & 0xF8) == 0) return i;
{code}


  was:
readVInt has to return positive numbers (and zero), throw some exception in 
case of negative numbers.

While for the sequence of bytes {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.

simplifying 
[readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113]
 up to last readByte (exclusive):

{code:java}
int i = ((byte)-1) & 0x7F;
i |= (((byte)-1) & 0x7F) << 7;
i |= (((byte)-1) & 0x7F) << 14;
i |= (((byte)-1) & 0x7F) << 21;
{code}

Here {{i = 268435455}} or in binary format is 
{{00001111_11111111_11111111_11111111}}

Keeping in mind that int is signed type we have only 3 more bits before 
overflow happens or in another words {{(Integer.MAX_VALUE - i) >> 28 == 7}}

that's max value could be stored in 5th byte to avoid overflow.

Instead of 

{code:java}
i |= (b & 0x0F) << 28;
if ((b & 0xF0) == 0) return i;
{code}

has to be

{code:java}
i |= (b & 0x07) << 28;
if ((b & 0xF8) == 0) return i;
{code}



> Incorrect readVInt: negative number could be returned
> -
>
> Key: LUCENE-8533
> URL: https://issues.apache.org/jira/browse/LUCENE-8533
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Vladimir Dolzhenko
>Priority: Major
> Attachments: readVInt.patch






[jira] [Updated] (LUCENE-8533) Incorrect readVInt: negative number could be returned

2018-10-16 Thread Vladimir Dolzhenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Dolzhenko updated LUCENE-8533:
---
Description: 
readVInt must return only non-negative numbers and throw an exception if the 
decoded value would be negative.

However, for the byte sequence {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.

simplifying 
[readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113]
 up to last readByte (exclusive):

{code:java}
int i = ((byte)-1) & 0x7F;
i |= (((byte)-1) & 0x7F) << 7;
i |= (((byte)-1) & 0x7F) << 14;
i |= (((byte)-1) & 0x7F) << 21;
{code}

Here {{i = 268435455}}, which in binary is 
{{00001111_11111111_11111111_11111111}}

Keeping in mind that {{int}} is a signed type, only 3 more bits are available before 
overflow happens; in other words, {{(Integer.MAX_VALUE - i) >> 28 == 7}} is 
the maximum value that can be stored in the 5th byte without overflow.

Instead of 

{code:java}
i |= (b & 0x0F) << 28;
if ((b & 0xF0) == 0) return i;
{code}

has to be

{code:java}
i |= (b & 0x07) << 28;
if ((b & 0xF8) == 0) return i;
{code}


  was:
readVInt has to return positive numbers (and zero), throw some exception in 
case of negative numbers.

While for the sequence of bytes {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.

simplifying 
[readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113]
 up to last readByte (exclusive):

{code:java}
int i = ((byte)-1) & 0x7F;
i |= (((byte)-1) & 0x7F) << 7;
i |= (((byte)-1) & 0x7F) << 14;
i |= (((byte)-1) & 0x7F) << 21;
{code}

Here {{i = 268435455}} or in binary format is 
{{00001111_11111111_11111111_11111111}}

Keeping in mind that {{int}} is signed type we have only 3 more bits before 
overflow happens or in another words {{(Integer.MAX_VALUE - i) >> 28 == 7}}

that's max value could be stored in 5th byte to avoid overflow.

Instead of 

{code:java}
i |= (b & 0x0F) << 28;
if ((b & 0xF0) == 0) return i;
{code}

has to be

{code:java}
i |= (b & 0x07) << 28;
if ((b & 0xF8) == 0) return i;
{code}



> Incorrect readVInt: negative number could be returned
> -
>
> Key: LUCENE-8533
> URL: https://issues.apache.org/jira/browse/LUCENE-8533
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Vladimir Dolzhenko
>Priority: Major
> Attachments: readVInt.patch






[jira] [Created] (LUCENE-8533) Incorrect readVInt: negative number could be returned

2018-10-16 Thread Vladimir Dolzhenko (JIRA)
Vladimir Dolzhenko created LUCENE-8533:
--

 Summary: Incorrect readVInt: negative number could be returned
 Key: LUCENE-8533
 URL: https://issues.apache.org/jira/browse/LUCENE-8533
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Vladimir Dolzhenko
 Attachments: readVInt.patch

readVInt must return only non-negative numbers and throw an exception if the 
decoded value would be negative.

However, for the byte sequence {{[-1, -1, -1, -1, 15]}} it returns {{-1}}.

Simplifying 
[readVInt|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L113]
 up to, but not including, the last readByte:

{code:java}
int i = ((byte)-1) & 0x7F;
i |= (((byte)-1) & 0x7F) << 7;
i |= (((byte)-1) & 0x7F) << 14;
i |= (((byte)-1) & 0x7F) << 21;
{code}

Here {{i = 268435455}}, which in binary is 
{{00001111_11111111_11111111_11111111}}

Keeping in mind that int is a signed type, only 3 more bits are available before 
overflow happens; in other words, {{(Integer.MAX_VALUE - i) >> 28 == 7}}.

That is the maximum value that can be stored in the 5th byte without overflow.

Instead of 

{code:java}
i |= (b & 0x0F) << 28;
if ((b & 0xF0) == 0) return i;
{code}

has to be

{code:java}
i |= (b & 0x07) << 28;
if ((b & 0xF8) == 0) return i;
{code}
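
The arithmetic in the description can be checked directly. The sketch below is only an illustration; the class name VIntHeadroomDemo and the helper afterFourBytes() are hypothetical.

```java
/** Checks the bit arithmetic quoted in the issue description. */
final class VIntHeadroomDemo {

  /** Value accumulated from four bytes of -1, exactly as in the description. */
  static int afterFourBytes() {
    int i = ((byte) -1) & 0x7F;
    i |= (((byte) -1) & 0x7F) << 7;
    i |= (((byte) -1) & 0x7F) << 14;
    i |= (((byte) -1) & 0x7F) << 21;
    return i;
  }

  public static void main(String[] args) {
    int i = afterFourBytes();
    System.out.println(i);
    // 28 one-bits; toBinaryString omits the four leading zeros.
    System.out.println(Integer.toBinaryString(i));
    // Headroom left for the 5th byte.
    System.out.println((Integer.MAX_VALUE - i) >> 28);
    // Four bits from the 5th byte reach the sign bit.
    System.out.println(i | (0x0F << 28));
    // Three bits stay within Integer.MAX_VALUE.
    System.out.println(i | (0x07 << 28));
  }
}
```

The last two lines show the difference between the masks: OR-ing in four bits gives -1, whereas three bits give exactly Integer.MAX_VALUE.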







[jira] [Updated] (LUCENE-8525) throw more specific exception on data corruption

2018-10-15 Thread Vladimir Dolzhenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Dolzhenko updated LUCENE-8525:
---
Description: 
DataInput throws a generic IOException if the data looks odd:

[DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]

There are other examples, such as 
[BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
 
[CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
 and maybe 
[DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]

That leads to some difficulties - see [elasticsearch 
#34322|https://github.com/elastic/elasticsearch/issues/34322]

It would be better to throw a more specific exception.

As a consequence, 
[SegmentInfos.readCommit|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java#L281]
 violates its own contract:

{code:java}
/**
   * @throws CorruptIndexException if the index is corrupt
   * @throws IOException if there is a low-level IO error
   */
{code}


  was:
DataInput throws generic IOException if data looks odd

[DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]

there are other examples like 
[BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
 
[CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
 and maybe 
[DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]

That leads to some difficulties - see [elasticsearch 
#34322|https://github.com/elastic/elasticsearch/issues/34322]

It would be better if it throws more specific exception.

As a consequence SegmentInfos.readCommit


> throw more specific exception on data corruption
> 
>
> Key: LUCENE-8525
> URL: https://issues.apache.org/jira/browse/LUCENE-8525
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Vladimir Dolzhenko
>Priority: Major






[jira] [Updated] (LUCENE-8525) throw more specific exception on data corruption

2018-10-15 Thread Vladimir Dolzhenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Dolzhenko updated LUCENE-8525:
---
Description: 
DataInput throws a generic IOException if the data looks odd:

[DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]

There are other examples, such as 
[BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
 
[CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
 and maybe 
[DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]

That leads to some difficulties - see [elasticsearch 
#34322|https://github.com/elastic/elasticsearch/issues/34322]

It would be better to throw a more specific exception.

As a consequence SegmentInfos.readCommit

  was:
DataInput throws generic IOException if data looks odd

[DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]

there are other examples like 
[BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
 
[CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
 and maybe 
[DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]

That leads to some difficulties - see [elasticsearch 
#34322|https://github.com/elastic/elasticsearch/issues/34322]

It would be better if it throws more specific exception like 
CorruptIndexException


> throw more specific exception on data corruption
> 
>
> Key: LUCENE-8525
> URL: https://issues.apache.org/jira/browse/LUCENE-8525
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Vladimir Dolzhenko
>Priority: Major






[jira] [Commented] (LUCENE-8525) throw more specific exception on data corruption

2018-10-05 Thread Vladimir Dolzhenko (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639669#comment-16639669
 ] 

Vladimir Dolzhenko commented on LUCENE-8525:


My point is to throw a more specific exception (it could be something like 
DataCorruptionException) rather than a plain IOException. 

From the point of view of a user (such as Elasticsearch), it would then be clear what 
happened underneath: whether it is a real I/O exception or a data-specific one. 
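
A minimal sketch of what this could look like for a caller follows. DataCorruptionException is only the name suggested above and is not an existing Lucene class; CorruptionHandlingDemo, readLength and classify are likewise hypothetical and exist only for illustration.

```java
import java.io.IOException;

/** Hypothetical exception type; the name comes from the comment and is not a Lucene class. */
class DataCorruptionException extends IOException {
  DataCorruptionException(String message) {
    super(message);
  }
}

final class CorruptionHandlingDemo {

  /** Hypothetical reader: rejects a length prefix that cannot be valid. */
  static int readLength(int rawLength) throws IOException {
    if (rawLength < 0) {
      throw new DataCorruptionException("negative length: " + rawLength);
    }
    return rawLength;
  }

  /** Caller-side classification: the subclass must be caught before the general IOException. */
  static String classify(int rawLength) {
    try {
      readLength(rawLength);
      return "ok";
    } catch (DataCorruptionException e) {
      return "corrupt data";
    } catch (IOException e) {
      return "I/O failure";
    }
  }

  public static void main(String[] args) {
    System.out.println(classify(42));
    System.out.println(classify(-1));
  }
}
```

Because the hypothetical type extends IOException, existing callers that catch IOException would keep working, while callers that care can tell corrupt data apart from a low-level I/O failure.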

> throw more specific exception on data corruption
> 
>
> Key: LUCENE-8525
> URL: https://issues.apache.org/jira/browse/LUCENE-8525
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Vladimir Dolzhenko
>Priority: Major
>
> DataInput throws generic IOException if data looks odd
> [DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]
> There are other examples, such as 
> [BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
>  
> [CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
>  and maybe 
> [DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]
> That leads to some difficulties - see [elasticsearch 
> #34322|https://github.com/elastic/elasticsearch/issues/34322]
> It would be better if it threw a more specific exception, such as 
> CorruptIndexException.






[jira] [Comment Edited] (LUCENE-8525) throw more specific exception on data corruption

2018-10-05 Thread Vladimir Dolzhenko (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639669#comment-16639669
 ] 

Vladimir Dolzhenko edited comment on LUCENE-8525 at 10/5/18 11:11 AM:
--

My point is to throw a more specific exception (it could be something like 
DataCorruptionException) rather than a plain IOException. 

From a user's point of view (Elastic, for example), it would then be clear what 
happened underneath: a real I/O exception or a data-specific one. 


was (Author: vladimir.dolzhenko):
My point is throw more specific exception (it could be smth like 
DataCorruptionException) rather plain IOException. 

from a user (like elastic) point of view it is clear what is happened 
underneath - is it a real io exception or data specific one. 

> throw more specific exception on data corruption
> 
>
> Key: LUCENE-8525
> URL: https://issues.apache.org/jira/browse/LUCENE-8525
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Vladimir Dolzhenko
>Priority: Major
>
> DataInput throws a generic IOException if the data looks odd:
> [DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]
> There are other examples, such as 
> [BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
>  
> [CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
>  and maybe 
> [DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]
> That leads to some difficulties - see [elasticsearch 
> #34322|https://github.com/elastic/elasticsearch/issues/34322]
> It would be better if it threw a more specific exception, such as 
> CorruptIndexException.






[jira] [Created] (LUCENE-8525) throw more specific exception on data corruption

2018-10-05 Thread Vladimir Dolzhenko (JIRA)
Vladimir Dolzhenko created LUCENE-8525:
--

 Summary: throw more specific exception on data corruption
 Key: LUCENE-8525
 URL: https://issues.apache.org/jira/browse/LUCENE-8525
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Vladimir Dolzhenko


DataInput throws a generic IOException if the data looks odd:

[DataInput:141|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/DataInput.java#L141]

There are other examples, such as 
[BufferedIndexInput:219|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java#L219],
 
[CompressionMode:226|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java#L226]
 and maybe 
[DocIdsWriter:81|https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/lucene/core/src/java/org/apache/lucene/util/bkd/DocIdsWriter.java#L81]

That leads to some difficulties - see [elasticsearch 
#34322|https://github.com/elastic/elasticsearch/issues/34322]

It would be better if it threw a more specific exception, such as 
CorruptIndexException.
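
To illustrate the request, here is a rough standalone sketch of a variable-length int decoder that reports malformed input with a corruption-specific exception instead of a plain IOException. It is loosely modelled on the check behind the DataInput link above, but it is not the Lucene code: CorruptDataException is a local stand-in (Lucene's own CorruptIndexException has different constructors), and VIntSketch, readVInt and describe are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Local stand-in for a corruption-specific exception (hypothetical).
class CorruptDataException extends IOException {
    CorruptDataException(String message) {
        super(message);
    }
}

class VIntSketch {
    // Decodes 7 bits per byte, low-order group first; the high bit of each
    // byte means "more bytes follow". At most 5 bytes are consumed.
    static int readVInt(InputStream in) throws IOException {
        int value = 0;
        for (int shift = 0; shift <= 28; shift += 7) {
            int b = in.read();
            if (b < 0) {
                // Truncated input stays a plain I/O-style failure.
                throw new EOFException("stream ended inside a vInt");
            }
            if (shift == 28 && (b & 0xF0) != 0) {
                // The 5th byte may only carry 4 payload bits and no
                // continuation bit; anything else is malformed data.
                throw new CorruptDataException("invalid vInt (too many bits)");
            }
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new AssertionError("unreachable: the 5th byte always returns or throws");
    }

    // Caller side: corruption and ordinary I/O failures are now distinguishable.
    static String describe(byte[] bytes) {
        try {
            return "value=" + readVInt(new ByteArrayInputStream(bytes));
        } catch (CorruptDataException e) {
            return "corrupt: " + e.getMessage();
        } catch (IOException e) {
            return "i/o: " + e.getMessage();
        }
    }
}
```

With this shape, a caller such as the one in the linked elasticsearch issue could react to corrupt data differently from a failing disk or a truncated stream, without parsing exception messages.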


