nikie commented on a change in pull request #15901:
URL: https://github.com/apache/beam/pull/15901#discussion_r746358197

##########
File path: sdks/python/apache_beam/io/textio.py
##########
@@ -362,6 +391,15 @@ def _is_self_overlapping(delimiter):
         return True
     return False
 
+  def _is_escaped(self, read_buffer, position):
+    # Returns True if byte at position is preceded with an odd number
+    # of escapechar bytes or False if preceded by 0 or even escapes
+    # (the even number means that all the escapes are escaped themselves).
+    for current_pos in reversed(range(-1, position)):
+      if read_buffer.data[current_pos:current_pos + 1] != self._escapechar:

Review comment:

I have updated the code to use an explicit counter, which makes it easier to understand.

Should I replace the bytes-slicing comparison with `if read_buffer.data[current_pos] != self._escapechar[0]`, or should we keep it consistent with `_find_separator_bounds`?

What about my suggestion in the first comment to have two versions of `_find_separator_bounds`, one for the default case and another for a custom delimiter and/or escapechar, to avoid extra `if` checks in the default use case? I can prepare such a version to show how it would look, if you are willing to consider that code duplication for performance reasons.
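
For reference, the two comparison styles discussed in the comment can be sketched as follows. This is only an illustration under stated assumptions, not the code in the PR: it works on a plain `bytes` value instead of Beam's `read_buffer` object, it assumes a single-byte escapechar, and the names `is_escaped_sliced` and `is_escaped_indexed` are hypothetical.

```python
def is_escaped_sliced(data: bytes, position: int, escapechar: bytes) -> bool:
    """Explicit-counter check using bytes slicing (hypothetical helper).

    Returns True if the byte at `position` is preceded by an odd number
    of escapechar bytes, i.e. the last escape is not itself escaped.
    """
    escape_count = 0
    current_pos = position - 1
    # A one-byte slice is a bytes object, so it compares directly
    # with the (assumed single-byte) escapechar.
    while current_pos >= 0 and data[current_pos:current_pos + 1] == escapechar:
        escape_count += 1
        current_pos -= 1
    return escape_count % 2 == 1


def is_escaped_indexed(data: bytes, position: int, escapechar: bytes) -> bool:
    """Same check using integer indexing (hypothetical helper)."""
    # In Python 3, indexing bytes yields an int, so compare against
    # the integer value of the escape byte.
    escape_byte = escapechar[0]
    escape_count = 0
    current_pos = position - 1
    while current_pos >= 0 and data[current_pos] == escape_byte:
        escape_count += 1
        current_pos -= 1
    return escape_count % 2 == 1
```

Both variants walk backwards from `position` and stop at the first non-escape byte, so they should agree on every input; the indexed form only avoids creating a one-byte slice per step.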