GitHub user rectang opened a pull request:
https://github.com/apache/lucy/pull/47
LUCY-295 Int widths for text/byte sizes
Address -Wconversion warnings relating to string lengths, code point
counts, and binary data lengths.
Also includes a minor refactoring of Inversion.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rectang/lucy LUCY-295-text-sizes
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/lucy/pull/47.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #47
----
commit 2fe08b3686270d2367871b9425039c7b07b51d46
Author: Marvin Humphrey <[email protected]>
Date: 2016-05-02T23:32:31Z
Refactor resizing of Inversion.
commit d6135b56c13257c044853aa5dd8fa16f1478018a
Author: Marvin Humphrey <[email protected]>
Date: 2016-05-02T23:41:36Z
Disallow Token lengths over 2 GB.
commit 080c33acda88cb5d35a8a91541873b906ad38310
Author: Marvin Humphrey <[email protected]>
Date: 2016-05-04T01:59:04Z
Fix end offsets for edge case highlight data.
Under some circumstances (outside the most common code paths), the end
offset for the last token in a field may have been too high, as a result
of counting bytes rather than code points in UTF-8 source data.
However, Highlighter only uses this data for heat mapping; it uses safe
string iteration when actually choosing excerpt boundaries, and cannot
overrun.
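To illustrate the bytes-versus-code-points distinction described above, here is a minimal sketch. `count_code_points` is a hypothetical helper, not a function from the Lucy sources, and it assumes the input is already valid UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper (not from the Lucy sources): count the code points
 * in a buffer that is assumed to hold valid UTF-8.  Every code point
 * contributes exactly one byte that is NOT a continuation byte, so it is
 * enough to skip bytes of the form 10xxxxxx. */
size_t
count_code_points(const char *utf8, size_t num_bytes) {
    size_t count = 0;
    for (size_t i = 0; i < num_bytes; i++) {
        if (((unsigned char)utf8[i] & 0xC0u) != 0x80u) {
            count++;
        }
    }
    return count;
}
```

For non-ASCII text the byte length exceeds the code point count, so an end offset derived from the byte length lands past the real end of the field.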
commit 619ec102499b0f7cc98006ce0ba71dd7f098f879
Author: Marvin Humphrey <[email protected]>
Date: 2016-05-04T02:17:49Z
Change type to avoid integer promotion confusion.
The compiler promotes `uint8_t` to `int` when performing bitwise
operations, which then gets confusing when you assign to a `size_t`
variable. Avoid the whole mess by using `unsigned` instead of
`uint8_t`.
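A sketch of the kind of code where this matters, under the assumption of a simple variable-length integer format. `decode_varint` and its encoding are hypothetical and are not taken from the Lucy sources; the point is only the type of the `byte` variable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder (not from the Lucy sources).  Assumed format: the
 * high bit of each byte means "more bytes follow", the low seven bits
 * carry the payload, most significant group first.  The caller must
 * guarantee that the buffer contains a terminating byte. */
size_t
decode_varint(const uint8_t *buf, size_t *consumed) {
    size_t value = 0;
    size_t i = 0;
    unsigned byte; /* `unsigned` keeps the bitwise results unsigned. */
    do {
        byte = buf[i++];
        /* With `uint8_t byte`, the operand would be promoted and
         * `byte & 0x7F` would have type `int`; mixing that into a
         * `size_t` is a sign conversion that -Wconversion flags. */
        value = (value << 7) | (byte & 0x7Fu);
    } while (byte & 0x80u);
    *consumed = i;
    return value;
}
```

With `unsigned`, every intermediate result is already unsigned, so the widening to `size_t` needs no cast.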
commit a4b0b3b252f4bf253039756bad02b2fe80077114
Author: Marvin Humphrey <[email protected]>
Date: 2016-05-03T00:08:32Z
Address -Wconversion for string/byte lengths.
For text lengths, Unicode code point counts, and sometimes arbitrary
byte lengths: add casts and address potential overflow issues with
checks.
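The general pattern of pairing a cast with a range check might look like the sketch below. `size_to_i32` is a hypothetical name chosen for illustration; the actual checks and error handling in the patch may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the Lucy sources): narrow a size_t to
 * int32_t only after confirming that the value fits.  Returns false and
 * leaves *out untouched when the value is too large. */
bool
size_to_i32(size_t size, int32_t *out) {
    if (size > (size_t)INT32_MAX) {
        return false;
    }
    *out = (int32_t)size; /* Explicit cast: safe after the check above. */
    return true;
}
```

The explicit cast silences -Wconversion, and the preceding check is what makes the cast safe rather than merely quiet.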
commit 5ba152510713a98c981e461e4102464a618f3807
Author: Marvin Humphrey <[email protected]>
Date: 2016-05-05T01:45:01Z
Change width of size variables for RawPosting.
----