GitHub user rectang opened a pull request: https://github.com/apache/lucy/pull/47

    LUCY-295 Int widths for text/byte sizes

    Address -Wconversion warnings relating to string lengths, code point
    counts, and binary data lengths.

    In addition, minor refactoring of Inversion.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rectang/lucy LUCY-295-text-sizes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucy/pull/47.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #47

----
commit 2fe08b3686270d2367871b9425039c7b07b51d46
Author: Marvin Humphrey <mar...@rectangular.com>
Date:   2016-05-02T23:32:31Z

    Refactor resizing of Inversion.

commit d6135b56c13257c044853aa5dd8fa16f1478018a
Author: Marvin Humphrey <mar...@rectangular.com>
Date:   2016-05-02T23:41:36Z

    Disallow Token lengths over 2 GB.

commit 080c33acda88cb5d35a8a91541873b906ad38310
Author: Marvin Humphrey <mar...@rectangular.com>
Date:   2016-05-04T01:59:04Z

    Fix end offsets for edge case highlight data.

    Under some circumstances (outside the most common code paths), the
    end offset for the last token in a field may have been too high, as
    a result of counting bytes rather than code points in UTF-8 source
    data. However, Highlighter only uses this data for heat mapping; it
    uses safe string iteration when actually choosing excerpt
    boundaries, and cannot overrun.

commit 619ec102499b0f7cc98006ce0ba71dd7f098f879
Author: Marvin Humphrey <mar...@rectangular.com>
Date:   2016-05-04T02:17:49Z

    Change type to avoid integer promotion confusion.

    The compiler promotes `uint8_t` to `int` when performing bitwise
    operations, which then gets confusing when you assign to a `size_t`
    variable. Avoid the whole mess by using `unsigned` instead of
    `uint8_t`.

commit a4b0b3b252f4bf253039756bad02b2fe80077114
Author: Marvin Humphrey <mar...@rectangular.com>
Date:   2016-05-03T00:08:32Z

    Address -Wconversion for string/byte lengths.

    For text lengths, Unicode code point counts, and sometimes arbitrary
    byte lengths: add casts and address potential overflow issues with
    checks.

commit 5ba152510713a98c981e461e4102464a618f3807
Author: Marvin Humphrey <mar...@rectangular.com>
Date:   2016-05-05T01:45:01Z

    Change width of size variables for RawPosting.

----
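
For readers unfamiliar with the promotion issue described in commit 619ec10, here is a minimal C11 sketch. It is not taken from the Lucy sources: the macro `EXPR_TYPE` and all four function names are hypothetical and exist only for illustration. It shows that masking a `uint8_t` produces an `int`, while the same mask on an `unsigned` stays unsigned all the way to `size_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Names the type of an expression after the usual promotions (C11).
 * Hypothetical macro, for illustration only. */
#define EXPR_TYPE(x) _Generic((x), \
    int: "int",                    \
    unsigned: "unsigned",          \
    default: "other")

/* The uint8_t operand is promoted to int before the mask is applied,
 * so the result is a signed int and the conversion to size_t is
 * spelled out with a cast. */
size_t
mask_from_u8(uint8_t byte) {
    return (size_t)(byte & 0x3F);
}

/* With an unsigned operand the expression stays unsigned throughout,
 * so converting to size_t cannot change the sign. */
size_t
mask_from_unsigned(unsigned byte) {
    return byte & 0x3Fu;
}

/* Reports the type of the masked expression for a uint8_t operand. */
const char*
type_of_u8_mask(void) {
    uint8_t byte = 0xBF;
    return EXPR_TYPE(byte & 0x3F);
}

/* Reports the type of the masked expression for an unsigned operand. */
const char*
type_of_unsigned_mask(void) {
    unsigned byte = 0xBF;
    return EXPR_TYPE(byte & 0x3Fu);
}
```

Both mask functions return the same value for the same input. The difference lies only in the intermediate type, which is what conversion warnings react to.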
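
The byte count versus code point count distinction mentioned in commit 080c33a can be sketched as follows. This is an illustrative example only, assuming well-formed UTF-8 input; `utf8_code_points` is a hypothetical name and is not the routine Lucy uses.

```c
#include <stddef.h>

/* Counts code points in a buffer assumed to hold well-formed UTF-8.
 * Continuation bytes have the bit pattern 10xxxxxx, so every byte that
 * does not match that pattern starts a new code point. */
size_t
utf8_code_points(const char *text, size_t num_bytes) {
    size_t count = 0;
    for (size_t i = 0; i < num_bytes; i++) {
        if (((unsigned char)text[i] & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For text containing any multi-byte sequence, the byte length is larger than the code point count, so an end offset computed from bytes lands too far to the right.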
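
The "add casts and address potential overflow issues with checks" approach from commit a4b0b3b might look roughly like the sketch below. The function name and the boolean return convention are assumptions made for this example; the actual patch may report the error differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: narrows a size_t length to int32_t only after
 * checking that the value fits.  Returns false and leaves *out
 * untouched when the length exceeds INT32_MAX. */
bool
checked_len_to_i32(size_t len, int32_t *out) {
    if (len > (size_t)INT32_MAX) {
        return false;
    }
    *out = (int32_t)len;  /* Safe: range was verified above. */
    return true;
}
```

Placing the range check before the cast keeps the cast itself trivially correct, and it turns an oversized length into an explicit failure instead of a silently truncated value.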