Basically here is what I'm proposing (if it works on your machine):
Index: dev-tools/scripts/checkJavadocLinks.py
===================================================================
--- dev-tools/scripts/checkJavadocLinks.py (revision 1382919)
+++ dev-tools/scripts/checkJavadocLinks.py (working copy)
@@ -24,7 +24,7 @@
reAtt = re.compile(r"""(?:\s+([a-z]+)\s*=\s*("[^"]*"|'[^']?'|[^'"\s]+))+""", re.I)
# Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
-reValidChar = re.compile("^[\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*$")
+reValidChar = re.compile("^[^\u0000-\u0008\u000B-\u000C\u000E-\u001F]*$")
# silly emacs: '
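
For anyone who wants to try the proposed pattern outside the build, here is a rough standalone sketch. The pattern is the one from the patch; the helper name is_valid_xml_text is made up for this example and is not in checkJavadocLinks.py. The idea is to reject only the C0 control characters that XML 1.0 forbids instead of whitelisting every valid range, so the character class never mentions anything above U+FFFF. The tradeoff is that the new pattern is looser: it no longer rejects the surrogate blocks, FFFE, or FFFF.

```python
import re

# Pattern from the proposed patch: a negated class listing only the
# control characters XML 1.0 disallows (tab, LF and CR stay legal).
reValidChar = re.compile("^[^\u0000-\u0008\u000B-\u000C\u000E-\u001F]*$")


def is_valid_xml_text(s):
    """Hypothetical helper (not in the script): True if s passes reValidChar."""
    return reValidChar.match(s) is not None


if __name__ == "__main__":
    samples = [
        "plain text with a tab\tand a newline\n",  # allowed whitespace
        "stray NUL \u0000 from a preprocessed escape",  # forbidden control char
        "supplementary character \U00010400",  # above U+FFFF, still accepted
        "noncharacter \uFFFE",  # accepted now, rejected by the old pattern
    ]
    for text in samples:
        print(repr(text), is_valid_xml_text(text))
```

If the stricter behavior matters, the excluded ranges could be added back to the negated class later; this sketch only covers what the patch itself changes.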
On Mon, Sep 10, 2012 at 11:14 AM, Robert Muir <[email protected]> wrote:
> Hmm, this looks like my regular expression to look for valid characters (we
> had some javadocs that intended \u0000 and so on but java preprocesses
> these, actually giving us invalid xml).
>
> Can you try removing the supplementary ranges from the regex just as a
> test? I don't really fully understand the state of python's unicode
> support.
>
> On Mon, Sep 10, 2012 at 11:10 AM, Yonik Seeley <[email protected]> wrote:
>> Thanks for fixing that.
>>
>> I'm trying to run javadocs-lint myself, but it's not working:
>>
>> javadocs-lint:
>> [exec] Traceback (most recent call last):
>> [exec] File "/usr/local/bin/../Cellar/python3/3.2/lib/python3.2/functools.py", line 176, in wrapper
>> [exec] result = cache[key]
>> [exec] KeyError: (<class 'str'>, '^[\t\n\r -\ud7ff\ue000-�𐀀-\U0010ffff]*$', 0)
>> [exec]
>> [exec] During handling of the above exception, another exception occurred:
>> [exec]
>> [exec] Traceback (most recent call last):
>> [exec] File "/opt/code/lusolr_clean2/lucene/../dev-tools/scripts/checkJavadocLinks.py", line 27, in <module>
>> [exec] reValidChar = re.compile("^[\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*$")
>> [exec] File "/usr/local/bin/../Cellar/python3/3.2/lib/python3.2/re.py", line 206, in compile
>> [exec] return _compile(pattern, flags)
>>
>> Anyone have any pointers?
>>
>> -Yonik
>> http://lucidworks.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
>
>
> --
> lucidworks.com
--
lucidworks.com