Xqt created this task.
Xqt added projects: Pywikibot, Pywikibot-tests.
Restricted Application added subscribers: pywikibot-bugs-list, Aklapper.

TASK DESCRIPTION
  GraalPy: charset detection crashes due to broken stdlib codec (euc_jis_2004)
  
  Description:
  ------------
  
  Under GraalPy, parts of the Pywikibot test suite fail when accessing requests.Response.apparent_encoding.
  
  The failure is caused by a broken standard library codec in GraalPy, not by Pywikibot itself.
  
  Error:
  ------
  
    LookupError: no such codec is supported.
      File ".../encodings/euc_jis_2004.py", line 10, in <module>
        codec = _codecs_jp.getcodec('euc_jis_2004')
    
    
    Stack trace (shortened):
    
    requests.Response.apparent_encoding
     → charset_normalizer.detect()
       → is_multi_byte_encoding()
         → import encodings.euc_jis_2004
           → LookupError
  
  Root cause:
  -----------
  
  GraalPy ships the module encodings.euc_jis_2004, but its _codecs_jp backend does not provide the codec: the module-level call _codecs_jp.getcodec('euc_jis_2004') raises LookupError.
  
  As a result, importing the encoding module fails during charset probing in charset_normalizer (used by requests).
  CPython does not exhibit this behavior.
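  
  The import failure can be checked in isolation with a small probe. This is a minimal sketch; the helper name codec_module_loads is hypothetical and not part of Pywikibot, requests or charset_normalizer. It only tries the same kind of import that is_multi_byte_encoding() performs and reports whether it succeeds.

```python
import importlib


def codec_module_loads(name):
    """Return True if encodings.<name> imports cleanly, False otherwise.

    Hypothetical helper for illustration only. LookupError is caught
    because that is what the broken codec module raises at import time
    in the traceback of this report.
    """
    try:
        importlib.import_module(f"encodings.{name}")
    except (ImportError, LookupError):
        return False
    return True


if __name__ == "__main__":
    # Compare the result between the affected GraalPy build and CPython.
    print("euc_jis_2004 loads:", codec_module_loads("euc_jis_2004"))
```

  Running the probe under both interpreters should make it clear whether a given GraalPy release is still affected.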
  
  Impact on Pywikibot:
  --------------------
  
  Tests that assume Response.apparent_encoding can always be accessed fail under GraalPy, even though the failure is interpreter-specific and unrelated to the code under test.
  
  Possible approaches:
  --------------------
  
  - Skip assertions involving apparent_encoding under GraalPy, or
  - Catch LookupError when accessing it
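  
  Both approaches could look roughly like the sketch below. The names IS_GRAALPY, safe_apparent_encoding and ExampleCharsetTest are hypothetical and do not exist in Pywikibot; the check assumes that GraalPy reports sys.implementation.name as 'graalpy'. The response argument is anything with an apparent_encoding attribute, such as a requests.Response.

```python
import sys
import unittest

# Assumption: GraalPy identifies itself as 'graalpy' here.
IS_GRAALPY = sys.implementation.name == "graalpy"


def safe_apparent_encoding(response):
    """Return response.apparent_encoding, or None if detection crashes.

    Hypothetical helper: catches the LookupError raised when a codec
    module fails to import during charset probing.
    """
    try:
        return response.apparent_encoding
    except LookupError:
        return None


class ExampleCharsetTest(unittest.TestCase):
    """Illustrative test case, not the real CharsetTestCase."""

    @unittest.skipIf(IS_GRAALPY, "apparent_encoding is broken on GraalPy")
    def test_apparent_encoding(self):
        class FakeResponse:
            apparent_encoding = "utf-8"

        self.assertEqual(safe_apparent_encoding(FakeResponse()), "utf-8")
```

  Skipping keeps the assertions strict on CPython, while catching LookupError keeps the tests running on GraalPy at the cost of weaker checks there.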
  
    ____________ CharsetTestCase.test_invalid_charset (charset='utf16') ____________
    
    /opt/hostedtoolcache/GraalPy/24.1.2/x64/lib/python3.11/importlib/__init__.py:211: in import_module
        ???
    
    /opt/hostedtoolcache/GraalPy/24.1.2/x64/lib/python3.11/site-packages/charset_normalizer/utils.py:259: in is_multi_byte_encoding
        importlib.import_module("encodings.{}".format(name)).IncrementalDecoder,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    
    /opt/hostedtoolcache/GraalPy/24.1.2/x64/lib/python3.11/importlib/__init__.py:126: in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
        #
        # euc_jis_2004.py: Python Unicode Codec for EUC_JIS_2004
        #
        # Written by Hye-Shik Chang <[email protected]>
        #
        
        import _codecs_jp, codecs
        import _multibytecodec as mbc
        
    >   codec = _codecs_jp.getcodec('euc_jis_2004')
        
        class Codec(codecs.Codec):
            encode = codec.encode
            decode = codec.decode
        
        class IncrementalEncoder(mbc.MultibyteIncrementalEncoder,
                                 codecs.IncrementalEncoder):
            codec = codec
        
        class IncrementalDecoder(mbc.MultibyteIncrementalDecoder,
                                 codecs.IncrementalDecoder):
            codec = codec
        
        class StreamReader(Codec, mbc.MultibyteStreamReader, codecs.StreamReader):
            codec = codec
        
        class StreamWriter(Codec, mbc.MultibyteStreamWriter, codecs.StreamWriter):
            codec = codec
        
        def getregentry():
            return codecs.CodecInfo(
                name='euc_jis_2004',
                encode=Codec().encode,
                decode=Codec().decode,
                incrementalencoder=IncrementalEncoder,
                incrementaldecoder=IncrementalDecoder,
                streamreader=StreamReader,
                streamwriter=StreamWriter,
    E           LookupError: no such codec is supported.
    
    
    /opt/hostedtoolcache/GraalPy/24.1.2/x64/lib/python3.11/encodings/euc_jis_2004.py:10: LookupError
    __________ CharsetTestCase.test_invalid_charset (charset='win-1251') ___________
    
    /opt/hostedtoolcache/GraalPy/24.1.2/x64/lib/python3.11/importlib/__init__.py:211: in import_module
        ???
    
    /opt/hostedtoolcache/GraalPy/24.1.2/x64/lib/python3.11/site-packages/charset_normalizer/utils.py:259: in is_multi_byte_encoding
        importlib.import_module("encodings.{}".format(name)).IncrementalDecoder,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    
    /opt/hostedtoolcache/GraalPy/24.1.2/x64/lib/python3.11/importlib/__init__.py:126: in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
        #
        # euc_jis_2004.py: Python Unicode Codec for EUC_JIS_2004
        #
        # Written by Hye-Shik Chang <[email protected]>
        #
        
        import _codecs_jp, codecs
        import _multibytecodec as mbc
        
    >   codec = _codecs_jp.getcodec('euc_jis_2004')
        
        class Codec(codecs.Codec):
            encode = codec.encode
            decode = codec.decode
        
        class IncrementalEncoder(mbc.MultibyteIncrementalEncoder,
                                 codecs.IncrementalEncoder):
            codec = codec
        
        class IncrementalDecoder(mbc.MultibyteIncrementalDecoder,
                                 codecs.IncrementalDecoder):
            codec = codec
        
        class StreamReader(Codec, mbc.MultibyteStreamReader, codecs.StreamReader):
            codec = codec
        
        class StreamWriter(Codec, mbc.MultibyteStreamWriter, codecs.StreamWriter):
            codec = codec
        
        def getregentry():
            return codecs.CodecInfo(
                name='euc_jis_2004',
                encode=Codec().encode,
                decode=Codec().decode,
                incrementalencoder=IncrementalEncoder,
                incrementaldecoder=IncrementalDecoder,
                streamreader=StreamReader,
                streamwriter=StreamWriter,
    E           LookupError: no such codec is supported.
    
    
    /opt/hostedtoolcache/GraalPy/24.1.2/x64/lib/python3.11/encodings/euc_jis_2004.py:10: LookupError
    ___________________

TASK DETAIL
  https://phabricator.wikimedia.org/T413766
