Re: Specify attachment encoding for couchdb

Alexander Shorin Fri, 03 Jan 2014 07:49:45 -0800

Erhm... just replace:

> db.put_attachment(doc, content, content_type='text/plain')


with

> db.put_attachment(doc, content, content_type='text/plain;charset=utf-8')

And CouchDB will remember it:

$ http HEAD http://localhost:5984/b/testing/test
HTTP/1.1 200 OK
Accept-Ranges: none
Cache-Control: must-revalidate
Content-Encoding: gzip
Content-Length: 102
Content-MD5: 7y85tiUeF/UX9kqpKAzQEw==
Content-Type: text/plain; charset=utf-8
Date: Fri, 03 Jan 2014 14:14:27 GMT
ETag: "7y85tiUeF/UX9kqpKAzQEw=="
Server: CouchDB/1.6.0+build.0bf1856 (Erlang OTP/R16B01)

It will also be available in the attachment stub info. So before decoding,
just read the content-type value, get the attachment's encoding from it,
and decode accordingly.
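
A minimal sketch of that last step, assuming you already have the
attachment's raw bytes and its content-type string. The helper names
(charset_from_content_type, decode_attachment) and the fallback default
are illustrative only, not part of couchdb-python:

```python
# -*- coding: utf-8 -*-
from email.message import Message


def charset_from_content_type(content_type, default='utf-8'):
    """Return the charset parameter of a Content-Type value, lowercased.

    Falls back to `default` (an assumption of this sketch) when the
    value carries no charset parameter.
    """
    msg = Message()
    msg['Content-Type'] = content_type
    return msg.get_content_charset() or default


def decode_attachment(raw_bytes, content_type, default='utf-8'):
    """Decode attachment bytes using the charset named in content_type."""
    return raw_bytes.decode(charset_from_content_type(content_type, default))


if __name__ == '__main__':
    greek = u'ΑΒΓΔ ΕΖΗΘ'
    raw = greek.encode('utf-8')
    print(decode_attachment(raw, 'text/plain;charset=utf-8') == greek)
```

With couchdb-python, the content type should be readable from the stub at
doc['_attachments'][name]['content_type'] and the bytes from
db.get_attachment(doc, name).read(), but check that against the version
you have installed.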

--
,,,^..^,,,


On Fri, Jan 3, 2014 at 7:43 PM, Daniel Gonzalez <[email protected]> wrote:
> No, what I mean is "how can I keep track of the encoding used for each of
> the attachments, so that I can decode them correctly whenever I want to?"
>
>
> On Fri, Jan 3, 2014 at 4:23 PM, Alexander Shorin <[email protected]> wrote:
>
>> Not sure if I follow your idea. Do you mean, how can you set such
>> charset info for existing attachments? In that case you have to
>> re-upload them.
>> --
>> ,,,^..^,,,
>>
>>
>> On Fri, Jan 3, 2014 at 6:40 PM, Daniel Gonzalez <[email protected]> wrote:
>> > Thanks but, how do you set that on a per-attachment basis in a couchdb
>> > document? If this is not supported, I guess I will have to add a mapping
>> > "attachments-encoding" to the document where I can associate each
>> > attachment with its encoding. Any comments on this?
>> >
>> >
>> > On Fri, Jan 3, 2014 at 3:18 PM, Alexander Shorin <[email protected]> wrote:
>> >
>> >> You can set MIME type as text/plain;charset=utf-8 to help browsers
>> >> detect the correct content encoding.
>> >> See http://tools.ietf.org/html/rfc2068#section-3.4 for more info
>> >> --
>> >> ,,,^..^,,,
>> >>
>> >>
>> >> On Fri, Jan 3, 2014 at 5:52 PM, Daniel Gonzalez <[email protected]> wrote:
>> >> > Hi,
>> >> >
>> >> > I have the following test script:
>> >> >
>> >> > # -*- coding: utf-8 -*-
>> >> >
>> >> > import os
>> >> > import couchdb
>> >> >
>> >> > GREEK = u'ΑΒΓΔ ΕΖΗΘ ΙΚΛΜ ΝΞΟΠ ΡΣΤΥ ΦΧΨΩ αβγδ εζηθ ικλμ νξοπ ρςτυ φχψω'
>> >> >
>> >> > # Prepare a unicode file, encoded using ENCODING
>> >> > ENCODING = 'utf-8'
>> >> > filename = '/tmp/test'
>> >> > open(filename, 'wb').write(GREEK.encode(ENCODING))
>> >> >
>> >> > # Create an empty document
>> >> > server = couchdb.Server()
>> >> > db = server['cdb-tests']
>> >> > doc_id = 'testing'
>> >> > doc = { }
>> >> > db[doc_id] = doc
>> >> >
>> >> > # Attach the file to the document
>> >> > content = open(filename, 'rb') # Open the file for reading
>> >> > db.put_attachment(doc, content, content_type='text/plain')
>> >> >
>> >> > As you can see, the file is utf-8 encoded, but when I attach that file to
>> >> > couchdb, I have no way to specify this encoding. Thus, requesting the
>> >> > attachment at http://localhost:5984/cdb-tests/testing/test returns the
>> >> > following Response Headers:
>> >> >
>> >> > HTTP/1.1 200 OK
>> >> > Server: CouchDB/1.2.0 (Erlang OTP/R15B01)
>> >> > ETag: "7y85tiUeF/UX9kqpKAzQEw=="
>> >> > Date: Fri, 03 Jan 2014 13:43:36 GMT
>> >> > Content-Type: text/plain
>> >> > Content-MD5: 7y85tiUeF/UX9kqpKAzQEw==
>> >> > Content-Length: 102
>> >> > Content-Encoding: gzip
>> >> > Cache-Control: must-revalidate
>> >> > Accept-Ranges: none
>> >> >
>> >> > Seeing the attachment with a browser shows complete gibberish. How can I
>> >> > store the encoding for couchdb attachments?
>> >> >
>> >> > Thanks and regards,
>> >> >
>> >> > Daniel
>> >> >
>> >> > PD: SO reference link: http://stackoverflow.com/q/20905157/647991
>> >>
>>
