[issue11909] Doctest sees directives in strings when it should only see them in comments

2011-07-03 Thread Devin Jeanpierre

Devin Jeanpierre jeanpierr...@gmail.com added the comment:

Updated the patch to the newest revision and switched it to the _tokenize 
function. It now also includes a test case to verify that the encoding 
directive is ignored during the tokenization (and every other) step.

I'll file a tokenize bug separately.

--
Added file: http://bugs.python.org/file22559/comments.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11909
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com




2011-07-03 Thread Devin Jeanpierre

Devin Jeanpierre jeanpierr...@gmail.com added the comment:

Erp, I forgot to run this against the rest of the tests. Disregard; I'll fix it 
up a bit later.

--





2011-07-03 Thread Devin Jeanpierre

Devin Jeanpierre jeanpierr...@gmail.com added the comment:

Updated.

--
Added file: http://bugs.python.org/file22562/comments3.diff





2011-06-24 Thread Devin Jeanpierre

Devin Jeanpierre jeanpierr...@gmail.com added the comment:

You're right, and good catch. If a doctest starts with a #coding:XXX line, 
this should break.

One option is to replace the call to tokenize.tokenize with a call to 
tokenize._tokenize and pass 'utf-8' as a parameter. Downside: that's a private 
and undocumented API. The alternative is to manually add a coding line that 
specifies UTF-8, so that any coding line in the doctest would be ignored. 

My preferred option would be to add the ability to read unicode to the tokenize 
API, and then use that. I can file a separate ticket if that sounds good, since 
it's probably useful to others too.
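
To illustrate the concern, here is a minimal sketch (not taken from the patch) of how the bytes-based tokenize.tokenize sniffs a leading coding line, while a str-based entry point does not. It assumes a current Python 3 release, where tokenize.generate_tokens accepts str lines; the fragment text is made up for the example.

```python
import io
import tokenize

# Hypothetical doctest fragment whose first line looks like a coding cookie.
fragment = "# coding: latin-1\nx = 1  # doctest: +SKIP\n"

# Bytes API: tokenize.tokenize() sniffs the cookie and reports the result
# in a leading ENCODING token.
byte_tokens = list(
    tokenize.tokenize(io.BytesIO(fragment.encode("utf-8")).readline)
)
detected = byte_tokens[0].string

# Text API: generate_tokens() reads str lines, so no encoding is sniffed
# and the coding line is just another comment token.
text_tokens = list(tokenize.generate_tokens(io.StringIO(fragment).readline))
comments = [t.string for t in text_tokens if t.type == tokenize.COMMENT]
```

With the text API the coding line shows up only as an ordinary comment, which is the behaviour the options above are trying to get.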

One other thing to worry about: I'm not sure how doctest treats tests with 
leading coding:XXX lines. I'd hope it ignores them; if it doesn't, then 
this is more complicated and the options above wouldn't work.

I'll see if I have the time to play around with this (and add more test cases 
to the patch, correspondingly) this weekend.

--





2011-06-24 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

I agree that having a unicode API for tokenize seems to make sense, and that 
would indeed require a separate issue.

That's a good point about doctest not otherwise supporting coding cookies.  
Those only really apply to source files, so no doctest fragment ought to 
contain a coding cookie at the start, and your patch ought to be fine.  But I'm 
not familiar with the doctest internals, so having some tests to prove 
everything is fine would be great.

Your code could use the tokenize sniffer to make sure the fragment reads as 
utf-8 and throw an error otherwise.  But using a unicode interface to tokenize 
would probably be cleaner, since I suspect it would mimic what doctest does 
otherwise (ignore coding cookies).  But I don't *know* the latter, so your 
checking it would be appreciated.
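
A rough sketch of the "sniffer" idea, assuming it refers to tokenize.detect_encoding: check what encoding the fragment declares and raise if it is not UTF-8. The helper name require_utf8 and the choice of ValueError are hypothetical, not part of doctest or the patch.

```python
import io
import tokenize

def require_utf8(fragment_bytes):
    """Raise ValueError if the fragment declares a non-UTF-8 encoding.

    Hypothetical helper for illustration only.
    """
    # detect_encoding() looks for a BOM or a coding cookie in the
    # first two lines and returns (encoding, lines_read).
    encoding, _ = tokenize.detect_encoding(io.BytesIO(fragment_bytes).readline)
    if encoding not in ("utf-8", "utf-8-sig"):
        raise ValueError("unexpected encoding: %s" % encoding)
    return encoding
```

A fragment without a cookie passes the check, while one starting with a line such as "# coding: latin-1" is rejected.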

--





2011-06-23 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

For the most part the patch looks good to me, too.  My one concern is the 
encoding.  tokenize detects the encoding...is it possible for the doctest 
fragment to be detected to be some encoding other than utf-8?

--
nosy: +benjamin.peterson, r.david.murray





2011-06-21 Thread Petri Lehtinen

Petri Lehtinen pe...@digip.org added the comment:

The patch looks good to me. It passes the old doctests tests and adds a new 
test case for what it's fixing.

--
nosy: +petri.lehtinen, tim_one





2011-04-22 Thread Devin Jeanpierre

New submission from Devin Jeanpierre jeanpierr...@gmail.com:

From the doctest source:

'Option directives are comments starting with doctest:.  Warning: this may 
give false  positives for string-literals that contain the string #doctest:.  
Eliminating these false positives would require actually parsing the string; 
but we limit them by ignoring any line containing #doctest: that is 
*followed* by a quote mark.'

This isn't a huge deal, but it's a bit annoying. Besides being confusing, this 
contradicts the doctest documentation, which states:

'Doctest directives are expressed as a special Python comment following an 
example’s source code'

No mention is made of this corner case where the regexp breaks.

As per the comment in the source, the patched version parses the source using 
the tokenize module, and runs a modified directive regex on all comment tokens 
to find directives.
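
The approach can be sketched roughly as follows. This is an illustration only, not the code in comments.diff: the simplified DIRECTIVE_RE pattern and the find_directives helper are hypothetical, and it uses the str-based tokenize.generate_tokens from current Python 3 releases.

```python
import io
import re
import tokenize

# Hypothetical, simplified directive pattern; not the regex from the patch.
DIRECTIVE_RE = re.compile(r"#\s*doctest:\s*(?P<options>[^\n]*)")

def find_directives(source):
    """Return the option strings of '# doctest:' comments in source (a str)."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only COMMENT tokens are inspected, so string literals that
        # merely contain '#doctest:' are never matched.
        if tok.type == tokenize.COMMENT:
            match = DIRECTIVE_RE.match(tok.string)
            if match:
                found.append(match.group("options").strip())
    return found

in_comment = "print(list(range(20)))  # doctest: +ELLIPSIS\n"
in_string = "print('# doctest: +ELLIPSIS')\n"
```

Here find_directives(in_comment) finds the directive, while find_directives(in_string) finds none, because the text sits inside a STRING token rather than a COMMENT token.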

--
components: Library (Lib)
files: comments.diff
keywords: patch
messages: 134278
nosy: Devin Jeanpierre
priority: normal
severity: normal
status: open
title: Doctest sees directives in strings when it should only see them in 
comments
Added file: http://bugs.python.org/file21757/comments.diff





2011-04-22 Thread R. David Murray

Changes by R. David Murray rdmur...@bitdance.com:


--
stage:  -> patch review
type:  -> feature request
versions: +Python 3.3
