On 29/09/2021 13.10, hongy...@gmail.com wrote:
On Wednesday, September 29, 2021 at 5:40:58 PM UTC+8, J.O. Aho wrote:
On 29/09/2021 10.22, hongy...@gmail.com wrote:
I tried to convert an xls file into csv with the following command, but it failed:

$ in2csv --sheet 'Sheet1' 2021-2022-1.xls
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'

The test file above is available here [1].

[1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls

Any hints for fixing this problem?
You need to delete the 13 first lines in the file

Yes. After deleting the top 3 lines, the problem has been fixed.

or see to it that your code trims the data first, before it starts parsing the XML.

Yes. I really want to do this trick programmatically, but how do I do it 
without manually editing the file?


You could do something like loading the XML into a string (myxmlstr) and then finding the first '<' in that string:

xmlstart = myxmlstr.find('<')

xmlstr = myxmlstr[xmlstart:]

Then pass xmlstr to the XML parser. Sure, it's not as convenient as loading the file directly into the XML parser.
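
Put together, the trimming step described above might look like the following minimal sketch. It assumes that whatever follows the leading blank lines is well-formed XML, which has not been verified for the file in question. The helper names strip_leading_junk and parse_xml_skipping_preamble are made up for illustration, and xml.etree.ElementTree is just one possible parser. The sketch works on bytes rather than a str so that the parser can honour any encoding declared in the XML itself.

```python
import xml.etree.ElementTree as ET


def strip_leading_junk(data):
    """Return data starting at the first b'<', dropping anything before it.

    Hypothetical helper: same idea as myxmlstr.find('<'), but on bytes.
    """
    start = data.find(b'<')
    if start == -1:
        raise ValueError("no '<' found; input does not look like XML")
    return data[start:]


def parse_xml_skipping_preamble(path):
    """Read the file at path, trim the leading junk, and parse the rest.

    Assumes the remainder of the file is well-formed XML.
    """
    with open(path, 'rb') as fh:
        raw = fh.read()
    return ET.fromstring(strip_leading_junk(raw))
```

Calling parse_xml_skipping_preamble('2021-2022-1.xls') would then return the root element if the assumption holds, or raise xml.etree.ElementTree.ParseError if the remaining content is not valid XML.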

I'm not saying this is the best way of doing it; I'm sure some Python wiz here would have a smarter solution.

--

 //Aho

--
https://mail.python.org/mailman/listinfo/python-list
