Hi Ambrose,
Can you specify the complete command line and the database you are using?
Yes, I fear you have lost 1000 entries for each error.
I'm not sure about the root cause of the problem; maybe you need to specify
some additional parameter to the database URI?
See https://imdbpy.readthedocs.io/en/latest/usage/s3.html for an example.
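To illustrate what I mean by an additional parameter, here is a minimal sketch. It assumes a MySQL database and a driver that accepts a "charset" query parameter in the URI, which I have not verified for your setup. The helper name with_charset, the credentials and the database name are placeholders of mine, not part of IMDbPY.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_charset(uri, charset="utf8mb4"):
    """Return uri with a charset query parameter added if it has none.

    Assumption: the database driver behind the URI understands a
    "charset" query parameter (common for MySQL drivers).
    """
    parts = urlsplit(uri)
    query = dict(parse_qsl(parts.query))
    # Keep an existing charset untouched; only add one when it is missing.
    query.setdefault("charset", charset)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder credentials and database name, for illustration only.
print(with_charset("mysql://user:password@localhost/imdb"))
```

The resulting string would be passed to the script in place of the plain URI; whether it changes anything depends on the driver in use.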
Another obvious source of information is the logs of the database.
Anything useful there?
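On the Python side, a 'charmap' error usually comes from a single-byte codec such as cp1252 being used to encode the text. The sketch below is only a guess at how to check this: it prints the encodings Python picks up from your environment and reproduces the same kind of error with a sample string. The helper name encode_error and the sample characters are mine.

```python
import locale
import sys


def encode_error(text, encoding):
    """Return the error message if text cannot be encoded, else None."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


# Encodings Python takes from the environment; a single-byte code page
# here would be consistent with the 'charmap' error.
print("preferred encoding:", locale.getpreferredencoding(False))
print("stdout encoding:", sys.stdout.encoding)

# Three CJK characters, written as escapes so this file stays ASCII.
sample = "\u9ed2\u6fa4\u660e"
print("cp1252:", encode_error(sample, "cp1252"))
print("utf-8:", encode_error(sample, "utf-8"))
```

If the first two lines report something other than UTF-8, that would point at the environment rather than at the table definition.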
Hope this helps,
On Thu, Sep 17, 2020 at 12:21 PM Ambrose Chapel wrote:
>
> I'm running the s32imdbpy.py script to import the gz files into my SQL
> database.
>
> I'm seeing this error a lot, example, when processing name.basics.tsv.gz:
>
> ERROR::error processing data: 1 entries lost: 'charmap' codec
> can't encode characters in position 0-9: character maps to <undefined>
>
>
> My database table is set to charset utf8_unicode_ci as per instructions.
>
> I guess my obvious question is how can I prevent this, but also, have I
> really lost 1,000 database entries? Or have I got those 1,000 database
> entries in my database but with some problem unicode characters missing, and
> the message is misleading?
>
> TIA
> ___
> Imdbpy-help mailing list
> Imdbpy-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/imdbpy-help
--
Davide Alberani [PGP KeyID: 0x3845A3D4AC9B61AD]
http://www.mimante.net/