This is an automated email from the ASF dual-hosted git repository.

gerben pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-annotator.git

commit 1126e1418ced3e3aedfad2256867227a0228b9b2
Merge: c4b5598 8dd8995
Author: Gerben <[email protected]>
AuthorDate: Fri Jan 8 16:13:49 2021 +0100

    Merge pull request #99: Generate less minimal prefixes&suffixes
    
    A TextQuoteSelector can add as much prefix and suffix as desired. Until 
now, we added only as much prefix and suffix as was strictly necessary to 
disambiguate the target from other occurrences of the exact same text in the 
same document. When an annotation should still anchor on a modified version of 
the document, it can be helpful to add a little more context, in order to be 
robust against the ambiguity that would result if after such a modification the 
quoted text appears in more pl [...]
    
    Also, it seems neat to have the prefix and suffix contain whole words 
instead of stopping halfway inside a word. This makes it pleasant to read when 
user interfaces expose the prefix&suffix. Also it makes the implementation 
closer to being compatible with the WICG TextFragments spec (see #60).
    
    This PR thus adds two ways to generate less minimal prefixes&suffixes:
    
        - Round them up to the next whitespace.
        - Optionally add prefix&suffix around a short quote even if it is not
        ambiguous.
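
    As an illustration of the first point, rounding up to whitespace could be
    sketched roughly as below. This is a minimal sketch on plain strings, not
    the code from this PR: the function names and the minimalLength parameter
    (the context length already found necessary for disambiguation) are
    assumptions made for the example.

```typescript
// Hypothetical helpers for illustration; not the functions added in this PR.

// Returns a prefix of at least `minimalLength` characters that ends at
// `quoteStart`, extended backwards until it begins right after whitespace
// (or at the start of the text).
function expandPrefixToWhitespace(
  text: string,
  quoteStart: number,
  minimalLength: number,
): string {
  let prefixStart = Math.max(0, quoteStart - minimalLength);
  while (prefixStart > 0 && !/\s/.test(text.charAt(prefixStart - 1))) {
    prefixStart--;
  }
  return text.slice(prefixStart, quoteStart);
}

// Returns a suffix of at least `minimalLength` characters that starts at
// `quoteEnd`, extended forwards until it ends right before whitespace
// (or at the end of the text).
function expandSuffixToWhitespace(
  text: string,
  quoteEnd: number,
  minimalLength: number,
): string {
  let suffixEnd = Math.min(text.length, quoteEnd + minimalLength);
  while (suffixEnd < text.length && !/\s/.test(text.charAt(suffixEnd))) {
    suffixEnd++;
  }
  return text.slice(quoteEnd, suffixEnd);
}

// Quoting "own" (indices 12..15) inside "brown":
const sampleText = "the quick brown fox jumps";
console.log(expandPrefixToWhitespace(sampleText, 12, 1)); // "br"
console.log(expandSuffixToWhitespace(sampleText, 15, 1)); // " fox"
```

    With a one-character minimal context, the prefix grows from "r" to "br"
    and the suffix from " " to " fox", so neither stops halfway inside a word.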
    
    I made rounding up to whitespace the default behaviour, while the previous 
behaviour can still be obtained using the option minimalContext. For the 
context around short quotes I would not know what would be a good default 
(might depend on use case and document length?); so I left it at 0 for now, 
i.e. the feature is turned off by default.
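
    One way to read the short-quote rule is as a length threshold with a
    default of 0. The sketch below assumes exactly that; the function name and
    the minimumQuoteLength parameter name are hypothetical and only illustrate
    why a default of 0 turns the feature off.

```typescript
// Hypothetical helper: how many more characters of context would be needed so
// that prefix + exact quote + suffix reaches the requested minimum length.
// With the default threshold of 0 the result is always 0 (feature off).
function extraContextNeeded(
  exactLength: number,
  prefixLength: number,
  suffixLength: number,
  minimumQuoteLength: number = 0,
): number {
  const currentLength = prefixLength + exactLength + suffixLength;
  return Math.max(0, minimumQuoteLength - currentLength);
}

console.log(extraContextNeeded(3, 0, 0));     // 0: feature is off by default
console.log(extraContextNeeded(3, 0, 0, 10)); // 7: short quote needs more context
console.log(extraContextNeeded(12, 0, 0, 10)); // 0: quote is already long enough
```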
    
    This PR also refactors the implementation a bit, reusing the seekers 
instead of creating new ones on every match.
    
    To pass options, I added an options object as the last function parameter. 
I thought we might want to move the scope parameter into this option object 
too, but scope is specific to the DOM implementation, so I’m not sure if that 
is desirable.
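
    The options-object pattern described above might look roughly like the
    sketch below. Only the minimalContext name comes from this message; the
    interface name, the minimumQuoteLength name, and the resolveOptions helper
    are assumptions for illustration, and the real signatures may differ.

```typescript
// Hypothetical option bag; only `minimalContext` is named in this message.
interface SketchDescribeOptions {
  // When true, keep the previous behaviour: no rounding up to whitespace.
  minimalContext?: boolean;
  // Assumed name for the short-quote threshold; 0 means the feature is off.
  minimumQuoteLength?: number;
}

// Fills in defaults via destructuring, so callers can omit the whole object
// or any single field (an explicit `undefined` also falls back to the default).
function resolveOptions(
  { minimalContext = false, minimumQuoteLength = 0 }: SketchDescribeOptions = {},
): Required<SketchDescribeOptions> {
  return { minimalContext, minimumQuoteLength };
}

console.log(resolveOptions());
// { minimalContext: false, minimumQuoteLength: 0 }
console.log(resolveOptions({ minimalContext: true }));
// { minimalContext: true, minimumQuoteLength: 0 }
```

    Keeping the options as the last, optional parameter means existing callers
    that pass no options keep working unchanged.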
    
    I added options for anything that would otherwise feel like we’re 
hardcoding a ‘magic number’, but of course quite a few choices about how exactly 
the algorithm works are hardcoded opinions too. I hesitated between a few 
variations, but this one seemed the most straightforward, with (I hope) 
generally sensible results. To be seen in practice, I guess.
    
    I added basic tests for each of the new behaviours. Currently these tests 
are still in the dom package, but should be refactored and moved into the 
selector package as the actual algorithms being tested reside there.

 packages/dom/src/text-quote/describe.ts           |   7 +-
 packages/dom/test/text-quote/describe-cases.ts    | 298 +++++++++++++++++++++-
 packages/dom/test/text-quote/describe.test.ts     |  61 ++++-
 packages/selector/src/text/describe-text-quote.ts | 209 ++++++++++++---
 packages/selector/src/text/seeker.ts              |  39 ++-
 web/demo/index.js                                 |   2 +-
 6 files changed, 552 insertions(+), 64 deletions(-)
