Treora commented on issue #85:
URL: 
https://github.com/apache/incubator-annotator/issues/85#issuecomment-674836567


   @tilgovi alluded to this chunking idea in #75:
   
   > However, I think we should consider going a step further and writing a 
text selector that consumes an iterator that yields chunks rather than 
receiving a full text with the initial call. This interface would be useful for 
streaming scenarios where the whole text may not be available or may be 
extremely large. The chunks themselves could be arrays or strings, and if we 
decide that they are strings we may wish to iterate over their code points.
   
   Just one idea this raised: would it go too far to simply ask for a stream 
of characters, which we permit to be nested? A simple experiment:
   
   ```
   // A "character" here is a string consisting of exactly one code point.
   function isCharacter(charOrStream) {
     return typeof charOrStream === 'string' && [...charOrStream].length === 1;
   }
   
   // Recursively walk an iterable whose items are characters or nested iterables.
   function printCharacters(iterable) {
     for (const charOrStream of iterable) {
       if (!isCharacter(charOrStream)) {
         printCharacters(charOrStream);
       } else {
         console.log(charOrStream);
       }
     }
   }
   
   printCharacters('bla. ');
   printCharacters(['two', ' chunks. ']);
   printCharacters(['chunks', [' could', ' nest.', '']]);
   ```
   
   I suppose such an approach may be elegant for a specification or reference 
implementation, but a stricter structure could help performance. Iterating 
through a string’s individual characters may be required anyway when anchoring 
a TextPositionSelector, but for a TextQuoteSelector it might be superfluous?
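
   To illustrate the stricter alternative, here is a rough sketch of a quote 
search over a flat iterable of string chunks that never visits individual 
characters. The name `findQuoteInChunks` and its return value (an offset in 
UTF-16 code units) are assumptions made up for this example, not part of the 
annotator codebase, and a real TextQuoteSelector matcher would also have to 
handle prefix and suffix.

```javascript
// Hypothetical helper: find the first occurrence of `quote` in a flat
// iterable of string chunks, without iterating character by character.
// Returns the offset (in UTF-16 code units) within the concatenated text,
// or -1 if the quote does not occur.
function findQuoteInChunks(chunks, quote) {
  let tail = '';      // end of the text seen so far, kept for matches that span chunks
  let tailStart = 0;  // offset of `tail` within the whole text
  for (const chunk of chunks) {
    const buffer = tail + chunk;
    const index = buffer.indexOf(quote);
    if (index !== -1) return tailStart + index;
    // Only the last quote.length - 1 code units can still begin a match.
    const keep = Math.min(buffer.length, Math.max(quote.length - 1, 0));
    tailStart += buffer.length - keep;
    tail = buffer.slice(buffer.length - keep);
  }
  return -1;
}

console.log(findQuoteInChunks(['two', ' chunks. '], 'o ch')); // 2
```

   Whether this is actually faster than the nested character stream would 
need measuring; the point is only that a flat chunk structure lets the search 
lean on the built-in `indexOf`.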



