[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-14 Thread Serhiy Storchaka


Change by Serhiy Storchaka:


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-14 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:


New changeset 7c722e32bf582108680f49983cf01eaed710ddb9 by Serhiy Storchaka in 
branch '3.9':
[3.9] bpo-45461: Fix IncrementalDecoder and StreamReader in the 
"unicode-escape" codec (GH-28939) (GH-28945)
https://github.com/python/cpython/commit/7c722e32bf582108680f49983cf01eaed710ddb9


--




[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-14 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:


New changeset 0bff4ccbfd3297b0adf690655d3e9ddb0033bc69 by Miss Islington (bot) 
in branch '3.10':
[3.10] bpo-45461: Fix IncrementalDecoder and StreamReader in the 
"unicode-escape" codec (GH-28939) (GH-28943)
https://github.com/python/cpython/commit/0bff4ccbfd3297b0adf690655d3e9ddb0033bc69


--




[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-14 Thread Serhiy Storchaka


Change by Serhiy Storchaka:


--
pull_requests: +27233
pull_request: https://github.com/python/cpython/pull/28945




[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-14 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:


New changeset c96d1546b11b4c282a7e21737cb1f5d16349656d by Serhiy Storchaka in 
branch 'main':
bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" 
codec (GH-28939)
https://github.com/python/cpython/commit/c96d1546b11b4c282a7e21737cb1f5d16349656d


--




[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-14 Thread miss-islington


Change by miss-islington:


--
nosy: +miss-islington
nosy_count: 5.0 -> 6.0
pull_requests: +27231
pull_request: https://github.com/python/cpython/pull/28943




[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-13 Thread Serhiy Storchaka


Change by Serhiy Storchaka:


--
keywords: +patch
pull_requests: +27228
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/28939




[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-13 Thread Serhiy Storchaka


Change by Serhiy Storchaka:


--
assignee:  -> serhiy.storchaka
versions: +Python 3.10, Python 3.11, Python 3.9 -Python 3.8




[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-13 Thread STINNER Victor


Change by STINNER Victor:


--
nosy: +serhiy.storchaka




[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-13 Thread Matthew Barnett


Matthew Barnett added the comment:

It can be shortened to this:

buffer = b"a" * 8191 + b"\\r\\n"

with open("bug_csv.csv", "wb") as f:
    f.write(buffer)

with open("bug_csv.csv", encoding="unicode_escape", newline="") as f:
    f.readline()

To me it looks like it's reading in blocks of 8K and then decoding them, but 
it isn't correctly handling an escape sequence that happens to cross a block 
boundary.
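
The block-boundary explanation above can be checked without a file. What follows is only a sketch under stated assumptions: the 8192-byte chunk size is inferred from the "8K" observation and the reported position 8191, and `decode_in_chunks` is a hypothetical helper, not part of the standard library. It feeds the data to the codec's incremental decoder one chunk at a time, roughly as a buffered text reader might. On interpreters that include the changesets listed in this thread the escape split across the boundary should be reassembled; on older ones the same UnicodeDecodeError is expected.

```python
import codecs


def decode_in_chunks(data: bytes, chunk_size: int) -> str:
    """Decode unicode-escape data chunk by chunk (hypothetical helper)."""
    decoder = codecs.getincrementaldecoder("unicode_escape")()
    parts = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # final=False tells the decoder that more input may follow, so a
        # fixed decoder can hold back an incomplete trailing escape.
        parts.append(decoder.decode(chunk, final=False))
    # Signal end of input and flush anything the decoder still holds.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


# Same data as the shortened reproducer: the first backslash is byte 8191,
# so an assumed 8192-byte chunk ends in the middle of the \r escape.
data = b"a" * 8191 + b"\\r\\n"

try:
    text = decode_in_chunks(data, 8192)
    print(repr(text[-4:]))
except UnicodeDecodeError as exc:
    # Expected on interpreters that predate the fix for this issue.
    text = None
    print("chunked decoding failed:", exc)
```

Passing `final=False` for every chunk except the last is what gives a stateful decoder the chance to buffer a partial escape instead of treating it as an error.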

--
nosy: +mrabarnett




[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-13 Thread Anatoly Myachev


Anatoly Myachev added the comment:

Hello!

I can reduce it a little.
The buffer shouldn't be made any smaller, as there seems to be some kind of 
relation with the buffer size used for IO operations.

buffer = 
b'col1,col2,col3,col4,col5,col6\\r\\n0,2000-01-01,0,00:00:00,DuBFsyerJU,1809.3924826424557\\r\\n10,2000-01-01,10,01:00:00,AlwGHbVPpB,2853.2392617952996\\r\\n20,2000-01-01,20,02:00:00,TEkGgsYXYz,9933.278931158615\\r\\n30,2000-01-01,30,03:00:00,tfvnynVSfp,8574.917426248916\\r\\n40,2000-01-01,40,04:00:00,YOGjhztMWe,3768.71871233428\\r\\n50,2000-01-01,50,05:00:00,vkTOJSeQmU,6330.252072351792\\r\\n60,2000-01-01,60,06:00:00,LeolDfaGyv,5052.618993456892\\r\\n70,2000-01-01,70,07:00:00,OcyrbYVtyr,4287.371622852719\\r\\n80,2000-01-01,80,08:00:00,VUwDPNhcFV,3589.697826814614\\r\\n90,2000-01-01,90,09:00:00,KOadtzcNyK,4794.158259020925\\r\\n100,2000-01-01,100,10:00:00,rdSOjXJBWC,8826.736894397129\\r\\n110,2000-01-01,110,11:00:00,qzwVBOklhk,8086.105782454443\\r\\n120,2000-01-01,120,12:00:00,UTRlqVfKoD,1012.5061461339624\\r\\n130,2000-01-01,130,13:00:00,wKqEkRhkfw,2511.3137510933934\\r\\n140,2000-01-01,140,14:00:00,LxklWJbgxo,406.7116346419042\\r\\n150,2000-01-01,150,15:00:00,SxmZkdUgHv,84
 
24.978062284761\\r\\n160,2000-01-01,160,16:00:00,nEvzypASGb,9890.252156059063\\r\\n170,2000-01-01,170,17:00:00,xiFkkjoDPB,2728.8359201479675\\r\\n180,2000-01-01,180,18:00:00,boMmgpBXgL,4231.680208002166\\r\\n190,2000-01-01,190,19:00:00,dXLJXWiXZI,7757.44902751916\\r\\n200,2000-01-01,200,20:00:00,PBdjwKoCMD,4915.090357003991\\r\\n210,2000-01-01,210,21:00:00,zGWLALpmoA,359.5243650158153\\r\\n220,2000-01-01,220,22:00:00,CfpZJoOqGZ,704.7990862762942\\r\\n230,2000-01-01,230,23:00:00,DrkxpLhpEN,520.3290677592321\\r\\n240,2000-01-02,240,00:00:00,TDKEBbZAzQ,5218.671660857721\\r\\n250,2000-01-02,250,01:00:00,gULwzvNeWO,4218.66872701774\\r\\n260,2000-01-02,260,02:00:00,ogSyzHWmNY,9026.657391329585\\r\\n270,2000-01-02,270,03:00:00,NetmmthtzN,2027.8312539582244\\r\\n280,2000-01-02,280,04:00:00,PoYiHipTzR,7667.627476518046\\r\\n290,2000-01-02,290,05:00:00,MjHIRGmsoq,4144.001792539834\\r\\n300,2000-01-02,300,06:00:00,qESRSNnNnO,5348.024681284471\\r\\n310,2000-01-02,310,07:00:00,sSIjcXWhLC,3622.46
 
73907599413\\r\\n320,2000-01-02,320,08:00:00,IvjrlljbeB,7500.419388155823\\r\\n330,2000-01-02,330,09:00:00,aVWVRXZjZy,3686.5972529264213\\r\\n340,2000-01-02,340,10:00:00,QKeTjcNlCG,1228.9751449454411\\r\\n350,2000-01-02,350,11:00:00,phEdHCVsbe,4254.15983968718\\r\\n360,2000-01-02,360,12:00:00,ursHJjQxRK,6099.131673115221\\r\\n370,2000-01-02,370,13:00:00,JvjcRlYcYG,1503.3586866746164\\r\\n380,2000-01-02,380,14:00:00,gzCyqHPRRb,7816.898213939008\\r\\n390,2000-01-02,390,15:00:00,lQZmobRwzt,8295.113759829599\\r\\n400,2000-01-02,400,16:00:00,qspiYGfTou,1987.8215069414816\\r\\n410,2000-01-02,410,17:00:00,mcqWMMzomf,15.878728570531964\\r\\n420,2000-01-02,420,18:00:00,fiPsxulpGU,5380.485947841902\\r\\n430,2000-01-02,430,19:00:00,gTAyTkpeez,4720.7159908343565\\r\\n440,2000-01-02,440,20:00:00,hzFbhAPvFX,946.5797295044975\\r\\n450,2000-01-02,450,21:00:00,NYNcYxsyVl,7333.850198973723\\r\\n460,2000-01-02,460,22:00:00,wvgMmIxLzo,7399.341315026157\\r\\n470,2000-01-02,470,23:00:00,bZoyzAGgEC,5464.0
 
53510955946\\r\\n480,2000-01-03,480,00:00:00,jZNaceUYyr,1390.8829937709977\\r\\n490,2000-01-03,490,01:00:00,sbfLgcCpru,9626.900131786555\\r\\n500,2000-01-03,500,02:00:00,MHpAkHfnmV,9406.471079089133\\r\\n510,2000-01-03,510,03:00:00,ENdFBGtRCq,3740.8773019724517\\r\\n520,2000-01-03,520,04:00:00,FzqXhMLHLY,4270.3585910905\\r\\n530,2000-01-03,530,05:00:00,wWinjEGhAj,8548.152649813675\\r\\n540,2000-01-03,540,06:00:00,LcxAImCvxt,4097.693176523874\\r\\n550,2000-01-03,550,07:00:00,sDhzGBYKpt,1673.7466277500146\\r\\n560,2000-01-03,560,08:00:00,jhagjcZhGU,4103.702089490347\\r\\n570,2000-01-03,570,09:00:00,ZIkRwPWyWP,9368.662605679918\\r\\n580,2000-01-03,580,10:00:00,uphgoCQwZY,3321.0096306747137\\r\\n590,2000-01-03,590,11:00:00,jEKaqqScLF,8442.084614664149\\r\\n600,2000-01-03,600,12:00:00,kSIJFBHVnL,4065.19226287942\\r\\n610,2000-01-03,610,13:00:00,YRhoANskYn,5089.668482943252\\r\\n620,2000-01-03,620,14:00:00,SnlwCSdkWf,5738.46737129545\\r\\n630,2000-01-03,630,15:00:00,ANfpLOiJTV,393.7754525
 

[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-13 Thread STINNER Victor


STINNER Victor added the comment:

Can you please try to write a simpler (shorter) reproducer?

--




[issue45461] UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string

2021-10-13 Thread Anatoly Myachev


New submission from Anatoly Myachev:

Expected behavior: if the `read()` function works correctly, then `readline()` 
should also work.

Reproducer in file - just run: `python test.py`.

Traceback (most recent call last):
  File "test.py", line 11, in <module>
    f.readline()
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\encodings\unicode_escape.py", line 26, in decode
    return codecs.unicode_escape_decode(input, self.errors)[0]
UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string
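
The position in the error message can be reproduced without any file I/O. This is a minimal sketch, assuming the text layer hands the codec the first 8192 bytes on their own (that chunk size is an assumption inferred from the reported position 8191, and `data`, `whole`, `first_chunk` and `error` are illustrative names). Decoding the whole buffer in one call succeeds, which is consistent with `read()` working, while decoding a chunk that ends in a lone backslash fails.

```python
# Illustrative data: 8191 filler bytes followed by the escaped text \r\n.
data = b"a" * 8191 + b"\\r\\n"

# Decoding everything in one call works: both escapes are complete.
whole = data.decode("unicode_escape")

# Assumed chunk size of 8192 bytes: this slice ends with a lone backslash.
first_chunk = data[:8192]

try:
    first_chunk.decode("unicode_escape")
except UnicodeDecodeError as exc:
    # The stateless decoder treats the chunk as complete input, so the
    # dangling backslash is reported as an error at its byte offset.
    error = exc
    print(exc)
else:
    error = None
```

The stateless call has no way to know that more bytes follow, which is why the failure only shows up when the input is consumed in pieces.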

--
components: Unicode
files: test.py
messages: 403837
nosy: anmyachev, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in 
position 8191: \ at end of string
type: behavior
versions: Python 3.8
Added file: https://bugs.python.org/file50354/test.py
