Hi,

I can highly recommend using pandas to read CSV files. It does a pretty good
job of guessing a lot of things without extra configuration.

Of course, it is one more dependency.


On Fri, 24 Jul 2020 at 17:09, Ronaldo Mata <ronaldomat...@gmail.com>
wrote:

> Yes, I will try it. I will let you know if anything comes up.
>
> On Wed, 22 Jul 2020 at 12:24 PM, Liu Zheng <
> firstday2...@gmail.com> wrote:
>
>> Hi,
>>
>> Are you sure that the file used for detection is the same file that you
>> opened and decoded, and that gave you the incorrect result?
>>
>> By the way, ASCII is a proper subset of UTF-8. If chardet said it is ASCII,
>> decoding it as UTF-8 should always work.
>>
>> If your file contains non-ASCII UTF-8 bytes, maybe it’s a bug in chardet?
>> You can try it directly first, without mixing it with Django’s request
>> handling. Make sure you can detect and decode the file locally in a test
>> program, then put it into the app.
>>
>> If you share the file, I’m also glad to help you try it.
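
A minimal local check of the ASCII/UTF-8 point above, as a sketch using only
Python built-ins. The sample strings are made up for illustration and are not
taken from the uploaded file. Pure ASCII bytes decode to the same text under
both codecs, while UTF-8 bytes for a non-ASCII character are rejected by the
ASCII decoder.

```python
# Illustrative sample data, not taken from the uploaded file.
ascii_bytes = "name,age\nAna,30\n".encode("ascii")
utf8_bytes = "name,city\nJosé,Bogotá\n".encode("utf-8")

# ASCII is a subset of UTF-8: both decoders give the same text.
same_text = ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")

# Non-ASCII UTF-8 bytes are rejected by the ASCII decoder.
try:
    utf8_bytes.decode("ascii")
    ascii_rejects_utf8 = False
except UnicodeDecodeError:
    ascii_rejects_utf8 = True

print(same_text, ascii_rejects_utf8)
```
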
>>
>> On Thu, 23 Jul 2020 at 12:04 AM, Ronaldo Mata <ronaldomat...@gmail.com>
>> wrote:
>>
>>> Hi Kovy, this is not solved. Liu Zheng, using
>>> chardet.detect(request.FILES['file'].read()) returns the encoding "ascii",
>>> which is not correct. For example, I uploaded a file encoded as UTF-7 and
>>> the result is wrong. Then I tried
>>> request.FILES['file'].read().decode('ascii'), which does not work either:
>>> it returns bad data. For example, the string "@" comes back as "+AEA-".
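
One possible explanation for the "ascii" result, sketched with Python's
built-in codecs: UTF-7 represents every character using only ASCII bytes, so a
detector that sees nothing but bytes below 128 may have no way to tell the two
apart. The example reuses the "+AEA-" sequence reported above; the variable
names are only illustrative.

```python
raw = b"+AEA-"  # the byte sequence reported above

# Every byte is in the ASCII range, so an "ascii" guess fits the bytes.
all_ascii = all(b < 128 for b in raw)

# Decoding as ASCII succeeds but leaves the UTF-7 escape untouched.
as_ascii = raw.decode("ascii")

# Decoding as UTF-7 turns the escape back into the intended character.
as_utf7 = raw.decode("utf-7")

print(all_ascii, repr(as_ascii), repr(as_utf7))
```

If the uploads are known to be UTF-7, decoding with "utf-7" explicitly is
probably more reliable than relying on detection.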
>>>
>>> On Wed, 22 Jul 2020 at 11:16, Kovy Jacob (<kovy.ja...@gmail.com>)
>>> wrote:
>>>
>>>> I’m confused. I don’t know if I can help.
>>>>
>>>> On Jul 22, 2020, at 11:11 AM, Liu Zheng <firstday2...@gmail.com> wrote:
>>>>
>>>> Hi, glad you solved the problem. Yes, both request.FILES['file'] and
>>>> the file handle in the chardet example are binary handles. A binary handle
>>>> presents the raw data. chardet takes a sequence of raw bytes and then
>>>> detects the encoding. With its prediction, if you want to read that piece
>>>> of data as text, you can use the .decode(<encoding format>) method of the
>>>> bytes object to get a Python string.
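
The detect-then-decode step described above could look roughly like this. It
is a sketch, not a definitive implementation: decode_upload and its fallback
parameter are hypothetical names, and the detection result is a hand-written
dict standing in for what chardet.detect(raw) returns (a dict whose "encoding"
value may be None).

```python
def decode_upload(raw, detected, fallback="utf-8"):
    """Decode raw bytes using a detection result shaped like chardet's.

    `detected` is assumed to be a dict with an "encoding" key, which may
    be None when the detector could not decide.
    """
    encoding = detected.get("encoding") or fallback
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # The guess was wrong or names an unknown codec: fall back and
        # replace undecodable bytes instead of failing the request.
        return raw.decode(fallback, errors="replace")


# Hand-written detection result standing in for chardet.detect(raw).
raw = "café;1\n".encode("latin-1")
text = decode_upload(raw, {"encoding": "ISO-8859-1", "confidence": 0.73})
print(text)
```

Falling back with errors="replace" keeps the view from raising on a bad guess,
at the cost of possibly storing replacement characters; rejecting the upload
instead is an equally reasonable choice.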
>>>>
>>>> On Wed, 22 Jul 2020 at 11:04 PM, Kovy Jacob <kovy.ja...@gmail.com>
>>>> wrote:
>>>>
>>>>> That’s probably not the proper answer, but that’s the best I can do.
>>>>> Sorry :-(
>>>>>
>>>>>
>>>>> On Jul 22, 2020, at 10:46 AM, Ronaldo Mata <ronaldomat...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Yes, the problem here is that the files will be uploaded by the user, so
>>>>> I don't know what delimiter I will receive. This is not a base command
>>>>> that I am using; it is logic that I want to incorporate in a view.
>>>>>
>>>>> On Wed, 22 Jul 2020 at 10:43, Kovy Jacob (<kovy.ja...@gmail.com>)
>>>>> wrote:
>>>>>
>>>>>> Ah, so is the problem that you don’t always know what the delimiter
>>>>>> is when you read it? If yes, what is the use case for this? You might not
>>>>>> need a universal solution; maybe just put all the info into a CSV
>>>>>> yourself, manually.
>>>>>>
>>>>>> On Jul 22, 2020, at 10:39 AM, Ronaldo Mata <ronaldomat...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi Kovy, I'm using the csv module, but I need to handle the delimiters of
>>>>>> the files: sometimes they come separated by ",", other times by ";", and
>>>>>> rarely by "|".
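
For the unknown-delimiter problem, the standard csv module has a Sniffer class
that can guess the delimiter from a sample. A minimal sketch, assuming the
upload has already been decoded to text; read_rows, the 4096-character sample
size, and the sample data are illustrative choices, not anything from the
thread.

```python
import csv
import io


def read_rows(text, candidates=",;|"):
    """Guess the delimiter from a sample of the text, then parse the rows."""
    sample = text[:4096]
    # Sniffer raises csv.Error if it cannot settle on a delimiter.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    reader = csv.DictReader(io.StringIO(text), dialect=dialect)
    return dialect.delimiter, list(reader)


delimiter, rows = read_rows("name;age;city\nAna;30;Lima\nBob;25;Quito\n")
print(delimiter, rows[0])
```

Restricting the candidates to the three delimiters mentioned above keeps the
guess from landing on some other frequent character. The Sniffer is still a
heuristic, so catching csv.Error and asking the user to pick a delimiter may
be a sensible fallback.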
>>>>>>
>>>>>> On Wed, 22 Jul 2020 at 10:28, Kovy Jacob (<kovy.ja...@gmail.com>)
>>>>>> wrote:
>>>>>>
>>>>>>> Could you just use the standard python csv module?
>>>>>>>
>>>>>>> On Jul 22, 2020, at 10:25 AM, Ronaldo Mata <ronaldomat...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Liu, thanks for your answer.
>>>>>>>
>>>>>>> This has been a headache. I am trying to read the file using
>>>>>>> csv.DictReader. Initially I had an error trying to get the dict keys when
>>>>>>> iterating over the rows, and I thought it could be the encoding, so I
>>>>>>> wanted to prepare the view to use the correct encoding. That is why I
>>>>>>> asked my question.
>>>>>>>
>>>>>>> 1) Your first approach doesn't work: if I send a UTF-8 file, chardet
>>>>>>> returns ascii as the encoding. It seems request.FILES['file'].read()
>>>>>>> returns bytes with that encoding.
>>>>>>>
>>>>>>> 2) In the end I realized that the problem was the delimiter of the
>>>>>>> CSV, but predicting it is another problem.
>>>>>>>
>>>>>>> Anyway, it was a task that I had to do and that was my limitation. I
>>>>>>> think there must be a library that does all this; uploading a CSV file
>>>>>>> is common practice in many web apps.
>>>>>>>
>>>>>>> On Tue, 21 Jul 2020 at 13:47, Liu Zheng (<
>>>>>>> firstday2...@gmail.com>) wrote:
>>>>>>>
>>>>>>>> Hi. First of all, I think it's impossible to perfectly detect the
>>>>>>>> encoding without further information. See the answer in this SO post:
>>>>>>>> https://stackoverflow.com/questions/436220/how-to-determine-the-encoding-of-text
>>>>>>>> There are many packages and tools to help detect the encoding, but keep
>>>>>>>> in mind that they only give educated guesses. (Most of the time the
>>>>>>>> guess is correct, but do check the dev page to see whether there are
>>>>>>>> known issues related to your problem.)
>>>>>>>>
>>>>>>>> Now let's say you have decided to use chardet. Check its doc page
>>>>>>>> for the usage:
>>>>>>>> https://chardet.readthedocs.io/en/latest/usage.html#usage You'll
>>>>>>>> have more than one solution. Here are some examples:
>>>>>>>>
>>>>>>>> 1. If the files uploaded to your server are all expected to be
>>>>>>>> small CSV files (less than a few MB, and not many users uploading
>>>>>>>> concurrently), you can do the following:
>>>>>>>>
>>>>>>>> import chardet
>>>>>>>>
>>>>>>>> # In the view that handles the uploaded file (assuming the file input
>>>>>>>> # name is just "file"):
>>>>>>>> file_content = request.FILES['file'].read()
>>>>>>>> result = chardet.detect(file_content)  # dict with the guessed encoding
>>>>>>>>
>>>>>>>> 2. Also, chardet seems to support incremental (line-by-line)
>>>>>>>> detection:
>>>>>>>> https://chardet.readthedocs.io/en/latest/usage.html#example-detecting-encoding-incrementally
>>>>>>>>
>>>>>>>> Given this, we can also read from request.FILES line by line and
>>>>>>>> pass each line to chardet:
>>>>>>>>
>>>>>>>> from chardet.universaldetector import UniversalDetector
>>>>>>>>
>>>>>>>> # Somewhere in a view function
>>>>>>>> detector = UniversalDetector()
>>>>>>>> file_handle = request.FILES['file']
>>>>>>>> for line in file_handle:
>>>>>>>>     detector.feed(line)
>>>>>>>>     if detector.done:
>>>>>>>>         break
>>>>>>>> detector.close()
>>>>>>>> # The result is available as a dict at detector.result
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tuesday, July 21, 2020 at 7:09:35 AM UTC+8, Ronaldo Mata wrote:
>>>>>>>>>
>>>>>>>>> How do I deal with encoding when reading a CSV file in a view?
>>>>>>>>>
>>>>>>>>> I have a view to upload a CSV file; in this view I read the file and
>>>>>>>>> save each row as a new record.
>>>>>>>>>
>>>>>>>>> My bug appears when I try to upload a CSV file with a
>>>>>>>>> different encoding (not UTF-8).
>>>>>>>>>
>>>>>>>>> How can I handle this in Django (using request.FILES)? I was
>>>>>>>>> researching and found chardet, but I don't know how to pass it
>>>>>>>>> request.FILES. I need help, please.
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "Django users" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to django-users+unsubscr...@googlegroups.com.
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/django-users/64307441-0e65-45a2-b917-ece15a4ea729o%40googlegroups.com
>>>>>>>> <https://groups.google.com/d/msgid/django-users/64307441-0e65-45a2-b917-ece15a4ea729o%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>>
>>>>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/CAHn91offCbz%3DH_QH%3D60wpVVM6xHFPnSj4oFg4ZMOso5PS5SfzA%40mail.gmail.com.
