[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2021-02-07 Thread Jason R. Coombs


Jason R. Coombs  added the comment:

Today I encountered another situation where it would be convenient to allow an 
ellipsis at the beginning of the syntax:

>>> pathlib.Path('abc')
...Path('abc')

Because pathlib.Path resolves to `PosixPath` or `WindowsPath` depending on the 
platform, it would be nice to match both.
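
A minimal sketch of why the expected output above is rejected, together with one platform-neutral alternative. The helper names `parse_error` and `run` and the `WORKAROUND` doctest are illustrative assumptions, not part of doctest or of the original report.

```python
import doctest
import pathlib

# Expected output that begins with "..." is read as a continuation prompt,
# so the parser rejects the example before any output matching happens.
BROKEN = """
>>> pathlib.Path('abc')  # doctest: +ELLIPSIS
...Path('abc')
"""

# Hypothetical alternative: assert on properties shared by both platforms.
WORKAROUND = """
>>> p = pathlib.Path('abc')
>>> isinstance(p, pathlib.PurePath), p.name
(True, 'abc')
"""


def parse_error(text):
    """Return the parser's error message for text, or None if it parses."""
    try:
        doctest.DocTestParser().get_examples(text, name="demo")
    except ValueError as exc:
        return str(exc)
    return None


def run(text):
    """Run a doctest string and return its (failed, attempted) results."""
    test = doctest.DocTestParser().get_doctest(
        text, {"pathlib": pathlib}, "demo", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test, out=lambda s: None)


print(parse_error(BROKEN))   # the "lacks blank after ..." message
print(run(WORKAROUND))       # results for the property-based variant
```

The alternative gives up the repr comparison entirely, which may or may not be acceptable when the doctest is also meant to serve as documentation.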

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2021-01-16 Thread Jason R. Coombs


Jason R. Coombs  added the comment:

I've encountered this issue again with a different use-case.

I'm attempting to add a doctest to a routine that emits the paths of the files 
it processes. I want to use ellipses to ignore the prefixes of the output 
because they're not pertinent to the test. Here's the test that might have 
worked: 
https://github.com/python/importlib_resources/commit/ca9d014e1b884ff7f8cee63a436832a3e6e809fb,
 but failed with:

```
___ ERROR collecting importlib_resources/tests/update-zips.py ___
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/doctest.py:939: in find
    self._find(tests, obj, name, module, source_lines, globs, {})
.tox/python/lib/python3.9/site-packages/_pytest/doctest.py:522: in _find
    doctest.DocTestFinder._find(  # type: ignore
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/doctest.py:1001: in _find
    self._find(tests, val, valname, module, source_lines,
.tox/python/lib/python3.9/site-packages/_pytest/doctest.py:522: in _find
    doctest.DocTestFinder._find(  # type: ignore
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/doctest.py:989: in _find
    test = self._get_test(obj, name, module, globs, source_lines)
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/doctest.py:1073: in _get_test
    return self._parser.get_doctest(docstring, globs, name,
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/doctest.py:675: in get_doctest
    return DocTest(self.get_examples(string, name), globs,
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/doctest.py:689: in get_examples
    return [x for x in self.parse(string, name)
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/doctest.py:651: in parse
    self._parse_example(m, name, lineno)
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/doctest.py:709: in _parse_example
    self._check_prompt_blank(source_lines, indent, name, lineno)
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/doctest.py:793: in _check_prompt_blank
    raise ValueError('line %r of the docstring for %s '
E   ValueError: line 6 of the docstring for importlib_resources.tests.update-zips.main lacks blank after ...: '.../data01/utf-16.file -> ziptestdata/utf-16.file'
```

I was able to work around the issue by injecting a newline into the output 
(https://github.com/python/importlib_resources/commit/b8d48d5a86a9f5bd391c18e1acb39b5697f7ca40).

I also notice that in some environments the test still fails because of the 
arbitrary ordering of the output, although it does pass in others.




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2019-05-13 Thread Michael Blahay


Michael Blahay  added the comment:

At the end of msg309603 it was stated that this issue is being changed to an 
enhancement. Later on, Tim Peters changed its Type back to behavior, but didn't 
provide any detail about why. Should this issue still be considered an 
enhancement?

--
nosy: +mblahay




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-07 Thread Tim Peters

Tim Peters  added the comment:

By the way, going back to your original problem, "the usual" solution to the 
fact that different platforms can list directories in different orders is simply 
to sort the listing yourself.  That's pretty easy in Python ;-)  Then your test can 
verify the hashes and names of _every_ file of interest - and would be clearer 
on the face of it than anything you could do to try to ignore every line save 
one.
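
The sorting suggestion above can be sketched as follows. `hash_listing` is a hypothetical stand-in for the real function, and its three-line listing is sample data taken from the hashes quoted elsewhere in this thread.

```python
import doctest


def hash_listing():
    """Stand-in for a function whose lines come back in arbitrary order.

    >>> for line in sorted(hash_listing().splitlines()):
    ...     print(line)
    6a116973e8f29c923a08c2be69b11859 ledger.py
    d41d8cd98f00b204e9800998ecf8427e __init__.py
    e1f9390d13c90c7ed601afffd1b9a9f9 records.py
    """
    return (
        "e1f9390d13c90c7ed601afffd1b9a9f9 records.py\n"
        "6a116973e8f29c923a08c2be69b11859 ledger.py\n"
        "d41d8cd98f00b204e9800998ecf8427e __init__.py\n"
    )


# Prints nothing when the doctest passes.
doctest.run_docstring_examples(hash_listing, {"hash_listing": hash_listing})
```

Because every line is listed, no ellipsis is needed and the test no longer depends on directory order.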




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-07 Thread Tim Peters

Tim Peters  added the comment:

Jason, an ellipsis will match an empty string.  But if your expected output is:

"""
x...
abcd
...
"""

you're asking for output that:

- starts with "x"
- followed by 0 or more of anything
- FOLLOWED BY A NEWLINE (I think you're overlooking this part)
- followed by "abcd" and a newline
- followed by 0 or more of anything
- followed by (and ending) with a newline

So, e.g., "xabcd\n" doesn't match - not because of the ellipsis, but because of 
the newline following the first ellipsis.  You can repair that by changing the 
expected output like so:

"""
x...abcd
...
"""

This still requires that "abcd" is _followed_ by a newline, but puts no 
constraints on what appears before it.

In your specific context, it seems you want to say that your expected line has 
to appear _as_ its own line in your output, so that it must appear either at 
the start of the output _or_ immediately following a newline.

Neither ellipses nor a simple string search is sufficient to capture that 
notion.  Fancier code can do it, or a regexp search, or, e.g.,

what_i_want_without_the_trailing_newline in output.splitlines()
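
The matching rules described above can be checked directly against `doctest.OutputChecker`. This is a small sketch; the `matches` helper and the sample strings are illustrative only.

```python
import doctest

checker = doctest.OutputChecker()


def matches(want, got):
    """Compare expected and actual output with ELLIPSIS enabled."""
    return checker.check_output(want, got, doctest.ELLIPSIS)


# The newline after the first ellipsis must be present in the output.
print(matches("x...\nabcd\n...\n", "xabcd\n"))            # False
print(matches("x...\nabcd\n...\n", "x\nabcd\ntail\n"))    # True

# With "abcd" on the same line as the ellipsis, anything may precede it.
print(matches("x...abcd\n...\n", "x junk abcd\ntail\n"))  # True

# Whole-line membership, independent of position or neighbours.
output = "first\nabcd\nlast\n"
print("abcd" in output.splitlines())                       # True
print("abcd" in "xabcd\n".splitlines())                    # False
```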




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-07 Thread Jason R. Coombs

Jason R. Coombs  added the comment:

Thank you Steven for creating a reproduction of the issue; I should have done 
that in the first place. I have the +ELLIPSIS enabled elsewhere in the test 
suite, which is why it didn't appear in my example.

I should clarify - what I thought was a suitable workaround turns out is not, 
in part because the ellipsis must match _something_ and cannot be a degenerate 
match, leading to [this 
failure](https://travis-ci.org/jaraco/jaraco.financial/jobs/325955523). So the 
workaround I thought I'd devised was only suitable in some environments (where 
some content did appear before the target content).

I conclude that trying to match only a single line from a 
non-deterministically-ordered list of lines isn't a function for which ellipsis 
is well suited. I'll be adapting the test to simply test for the presence of 
the expected substring. Therefore, the use-case I presented is invalid (at 
least while ellipsis must match at least one character).
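
The environment-dependent behaviour of the prefix workaround can be reproduced with `doctest.OutputChecker`. This is a minimal sketch: the listings and the helper `workaround_passes` are made up for illustration, and `res` is assumed to carry no trailing newline.

```python
import doctest

TARGET = "d41d8cd98f00b204e9800998ecf8427e __init__.py"
OTHER = "e1f9390d13c90c7ed601afffd1b9a9f9 records.py"

# Expected output for: >>> print('x' + res)  # doctest: +ELLIPSIS
WANT = "x...\n" + TARGET + "\n...\n"

checker = doctest.OutputChecker()


def workaround_passes(res):
    """Apply the 'x' prefix workaround to res and check it against WANT."""
    got = "x" + res + "\n"  # what print('x' + res) would write
    return checker.check_output(WANT, got, doctest.ELLIPSIS)


target_second = OTHER + "\n" + TARGET + "\n" + OTHER
target_first = TARGET + "\n" + OTHER

print(workaround_passes(target_second))  # True: a line precedes the target
print(workaround_passes(target_first))   # False: nothing precedes the target
```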

Still, I suspect I haven't been the only person to encounter the reported 
ambiguity, and I appreciate the progress toward addressing it. I like Steven's 
approach, as it's simple and directly addresses the ambiguity. It does have the 
one downside that for the purposes of the documentation, it's a little less 
elegant, as a literal "<ELLIPSIS>" appears in the docstring.

Perhaps instead of "ELLIPSIS", the indicator should be "ANYTHING" or similar, 
acting more as a first-class feature rather than a stand-in for an ellipsis. 
That would save the human reader the distraction and trouble of translating 
"<ELLIPSIS>" to "..." before interpreting the value (even if that's what the 
doctest interpreter does under the hood).

Alternatively, consider "<...>" as the syntax. I'm liking that because it 
almost looks like its intention, avoiding much of the distraction. As I think 
about it more, I'm pretty sure such an approach is not viable, as it's a new 
syntax (non-alpha in the directive) and highly likely to conflict with existing 
doctests in the wild.

Another way to think about this problem is that the literal "..." is only 
non-viable when it's the first content in the expected output. Perhaps all 
that's needed is a signal that the output is starting, with something like 
"" or "" or "" or "" or "", a token like 
"" except it's an empty match specifically designed to make the 
transition. Such a token would specifically address the issue at the border of 
the test and the output and would _also_ address the issue if the expected 
output begins with a _literal_ "...". Consider this case:

# --- cut %< ---
import doctest

def print_3_dot():
    """
    >>> print_3_dot()
    ...
    """
    print('...')

doctest.run_docstring_examples(print_3_dot, globals())
# --- cut %< ---

In that case, "<ELLIPSIS>" may also work, but only because a literal 
substitution is being made. One _might_ be surprised when "<ELLIPSIS>" doesn't 
match anything (when +ELLIPSIS is not enabled).

Overall, I'm now thinking the "<ELLIPSIS>" solution is suitable and clear 
enough.




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-07 Thread Steven D'Aprano

Steven D'Aprano  added the comment:

Tim Peters said:
> Right, "..." immediately after a ">>>" line is taken to indicate a code 
> continuation line, and there's no way to stop that short of rewriting the 
> parser.


I haven't gone through the source in detail, but it seems to me that we could 
change OutputChecker.check_output to support this without touching the parser.

Ignoring issues of backwards compatibility for the moment, suppose we accept 
either '...' or '<ELLIPSIS>' as the wild card in the output section. Jason's 
example would then become:

>>> print(res)  # doctest: +ELLIPSIS
<ELLIPSIS>
d41d8cd98f00b204e9800998ecf8427e __init__.py
...

check_output could replace the substring '<ELLIPSIS>' with three dots before 
doing anything else, and Bob's yer uncle.

Or in this case, Uncle Timmy's yer uncle :-)

There's probably a million details I haven't thought of, but it seems like a 
promising approach to me. I did a quick hack of doctest, adding

want = want.replace('<ELLIPSIS>', '...')

to the start of OutputChecker.check_output and it seems to work.

If this is acceptable, we'll probably need a directive to activate it, for the 
sake of backwards compatibility.
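
A rough sketch of the idea as a checker subclass rather than a patch to doctest itself. The class name `TokenOutputChecker`, the `run` helper, and the exact token spelling in `TOKEN` are assumptions made for illustration; nothing like this exists in the standard library.

```python
import doctest

TOKEN = "<ELLIPSIS>"  # assumed spelling of the proposed placeholder


class TokenOutputChecker(doctest.OutputChecker):
    """Hypothetical checker that rewrites TOKEN to '...' before comparing."""

    def check_output(self, want, got, optionflags):
        want = want.replace(TOKEN, "...")
        return super().check_output(want, got, optionflags)


# Raw string so that \n stays an escape inside the doctest source line.
DOC = r'''
>>> print("a\nneedle\nb")  # doctest: +ELLIPSIS
<ELLIPSIS>
needle
...
'''


def run(text, checker=None):
    """Run a doctest string and return its (failed, attempted) results."""
    test = doctest.DocTestParser().get_doctest(text, {}, "demo", None, 0)
    runner = doctest.DocTestRunner(checker=checker, verbose=False)
    return runner.run(test, out=lambda s: None)


print(run(DOC))                        # stock checker: the token is literal
print(run(DOC, TokenOutputChecker()))  # token treated as an ellipsis
```

Because the token is not a prompt, the parser leaves it in the expected output, and the substitution happens only at comparison time.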

Thoughts?




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-06 Thread Tim Peters

Tim Peters  added the comment:

And I somehow managed to unsubscribe Steven :-(

--
nosy: +steven.daprano




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-06 Thread Tim Peters

Tim Peters  added the comment:

Right, "..." immediately after a ">>>" line is taken to indicate a code 
continuation line, and there's no way to stop that short of rewriting the 
parser.

The workaround you already found could be made more palatable if you weren't 
determined to make it impenetrable ;-)  For example,

"""
>>> print("not an ellipsis\\n" + res) #doctest:+ELLIPSIS
not an ellipsis
...
d41d8cd98f00b204e9800998ecf8427e __init__.py
...
"""

Or if this is a one-off, some suitable variant of this is simple:

"""
>>> "d41d8cd98f00b204e9800998ecf8427e __init__.py" in res
True
"""

I'd prefer that, since it directly says you don't care about anything other 
than that `res` contains a specific substring (in the original way, that has to 
be _inferred_ from the pattern of ellipses).
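
A small sketch showing that the substring form is insensitive to ordering. The three-line listing and the `failures` helper are illustrative assumptions; only the doctest text itself comes from the suggestion above.

```python
import doctest
import itertools

LINES = [
    "e1f9390d13c90c7ed601afffd1b9a9f9 records.py",
    "d41d8cd98f00b204e9800998ecf8427e __init__.py",
    "b83c8a54d6b71e28ccb556a828e3fa5e qif.py",
]

SUBSTRING_TEST = '''
>>> "d41d8cd98f00b204e9800998ecf8427e __init__.py" in res
True
'''


def failures(res):
    """Run SUBSTRING_TEST against one value of res; return the failure count."""
    test = doctest.DocTestParser().get_doctest(
        SUBSTRING_TEST, {"res": res}, "demo", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test, out=lambda s: None).failed


# Every ordering of the listing passes, since only containment is asserted.
print(all(failures("\n".join(p)) == 0 for p in itertools.permutations(LINES)))
```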

--
nosy:  -steven.daprano
type: enhancement -> behavior




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-06 Thread Steven D'Aprano

Steven D'Aprano  added the comment:

Oops, somehow managed to accidentally unsubscribe r.david.murray

--
nosy: +r.david.murray




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-06 Thread Steven D'Aprano

Steven D'Aprano  added the comment:

Here's a simple demonstration of the issue:


# --- cut %< ---
import doctest

def hash_files():
    """
    >>> hash_files()  # doctest: +ELLIPSIS
    ...
    d41d8cd98f00b204e9800998ecf8427e __init__.py
    ...

    """
    print("""\
e1f9390d13c90c7ed601afffd1b9a9f9 records.py
6a116973e8f29c923a08c2be69b11859 ledger.py
d41d8cd98f00b204e9800998ecf8427e __init__.py
b83c8a54d6b71e28ccb556a828e3fa5e qif.py
ac2d598f65b6debe9888aafe51e9570f ofx.py
9f2572f761342d38239a1394f4337165 msmoney.py
""")

doctest.run_docstring_examples(hash_files, globals())

# --- cut %< ---


The documentation does say that output must follow the final >>> or ... 

https://docs.python.org/3/library/doctest.html#how-are-docstring-examples-recognized

so I believe this is expected behaviour and not a bug.
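
How the parser divides such a docstring into source and expected output can be inspected with `doctest.DocTestParser`. This is a minimal sketch; `DOCSTRING` simply repeats the example from the demonstration above.

```python
import doctest

DOCSTRING = """
>>> hash_files()  # doctest: +ELLIPSIS
...
d41d8cd98f00b204e9800998ecf8427e __init__.py
...
"""

# Exactly one example is found in the text.
(example,) = doctest.DocTestParser().get_examples(DOCSTRING)

# The first "..." was consumed as an (empty) continuation of the source.
print(repr(example.source))
# Only the remaining lines are treated as expected output.
print(repr(example.want))
```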

Here is a workaround. Change the doctest to something like this:


>>> print('#', end=''); hash_files()  # doctest: +ELLIPSIS
#...
d41d8cd98f00b204e9800998ecf8427e __init__.py
...



But a more elegant solution would be to add a new directive to tell doctest to 
interpret the ... or >>> as output, not input, or to add a new symbol similar 
to <BLANKLINE>.

I'm changing this to an enhancement request as I think this would be useful.

--
components: +Library (Lib)
nosy: +steven.daprano, tim.peters -r.david.murray
type: behavior -> enhancement
versions: +Python 3.8




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-06 Thread R. David Murray

R. David Murray  added the comment:

Ah, I see my answer crossed with your post :)




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-06 Thread R. David Murray

R. David Murray  added the comment:

What happens if you print a placeholder line first, before your test output?  
I'm not sure it will work, I seem to remember something about an ellipsis 
starting a line just not being supported, but it was a long time ago...

So, that doesn't work, maybe do something like res = ['x' + l for l in res] so 
that you can use x...?

--
nosy: +r.david.murray




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-06 Thread Jason R. Coombs

Jason R. Coombs  added the comment:

I did find [this ugly 
workaround](https://github.com/jaraco/jaraco.financial/commit/9b866ab7117d1cfc26d7cdcec10c63a608662b46):

>>> print('x' + res)
x...




[issue32509] doctest syntax ambiguity between continuation line and ellipsis

2018-01-06 Thread Jason R. Coombs

New submission from Jason R. Coombs :

I'm trying to write a doctest that prints the hash and filename of each file in 
a directory. The input is the test dir, but due to the unordered nature of file 
systems, the doctest checks for one known file:

def hash_files(root):
    """
    >>> res = hash_files(Path(__file__).dirname())
    Discovering documents
    Hashing documents
    ...
    >>> print(res)
    ...
    d41d8cd98f00b204e9800998ecf8427e __init__.py
    ...
    """

However, this test fails with:

― [doctest] jaraco.financial.records.hash_files ――
047 
048 >>> res = hash_files(Path(__file__).dirname())
049 Discovering documents
050 Hashing documents
051 ...
052 >>> print(res)
Expected:
d41d8cd98f00b204e9800998ecf8427e __init__.py
...
Got:
e1f9390d13c90c7ed601afffd1b9a9f9 records.py
6a116973e8f29c923a08c2be69b11859 ledger.py
d41d8cd98f00b204e9800998ecf8427e __init__.py
b83c8a54d6b71e28ccb556a828e3fa5e qif.py
ac2d598f65b6debe9888aafe51e9570f ofx.py
9f2572f761342d38239a1394f4337165 msmoney.py




The first ellipsis is interpreted as a degenerate continuation of the input 
line, and it seems it's not possible to have an ellipsis at the beginning of 
the expected output.

Is there any workaround for this issue?

--
messages: 309599
nosy: jason.coombs
priority: normal
severity: normal
status: open
title: doctest syntax ambiguity between continuation line and ellipsis
type: behavior
