On Monday, 20 January 2014 at 08:33:09 UTC, ilya-stromberg wrote:
Do you know any library with string encoding/decoding support? I need more encodings than `std.encoding` provides.

I wrote one that does a little more decoding, but it has no encoding support at all. (I wrote it for my web scraper and email reader, so all I cared about was getting the input to UTF-8.)

https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff/blob/master/characterencodings.d

auto s = convertToUtf8(your_raw_data, "current_encoding");
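
For comparison, when the source encoding is one that `std.encoding` already covers, Phobos alone may be enough. Here is a minimal sketch for Windows-1252 input; the helper name `cp1252ToUtf8` is made up for illustration, only `transcode` and `Windows1252String` come from Phobos:

```d
import std.encoding : transcode, Windows1252String;

/// Hypothetical helper: converts raw Windows-1252 bytes to a UTF-8 string.
string cp1252ToUtf8(immutable(ubyte)[] raw)
{
    // Reinterpret the bytes as Windows-1252 text; the cast itself copies nothing.
    auto src = cast(Windows1252String) raw;

    // transcode decodes each source character and re-encodes it as UTF-8.
    string result;
    transcode(src, result);
    return result;
}
```

This only helps for the handful of encodings Phobos knows about, which is the limitation the question is about.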


If you want something full featured, GNU iconv isn't hard to use from D:


import core.stdc.errno;
import std.string : toStringz;

pragma(lib, "iconv");

extern(C) {
    alias void* iconv_t;
    iconv_t iconv_open(const char *tocode, const char *fromcode);
    int iconv_close(iconv_t cd);

    size_t iconv(iconv_t cd,
                 char **inbuf, size_t *inbytesleft,
                 char **outbuf, size_t *outbytesleft);
}

    auto i = iconv_open("UTF-8", toStringz("CP1252"));
    if(i == cast(void*) -1) throw new Exception("iconv open failed");
    scope(exit) iconv_close(i);

    /* get input pointer and length ready; content holds the raw bytes */
    char* input = cast(char*) content.ptr;
    size_t inputLength = content.length;

    /* Allocate an output buffer with 4x the size of the input buffer */
    // keep the output buffer around as a slice and get a pointer to it for the lib
    auto startingOutputBuffer = new char[content.length * 4];
    char* outputBuffer = startingOutputBuffer.ptr;
    size_t outputLength = startingOutputBuffer.length;

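    while(inputLength) {
        auto ret = iconv(i, &input, &inputLength, &outputBuffer, &outputLength);
        if(ret == cast(size_t) -1) {
            // check errno. errno == 84 (EILSEQ on Linux) means the input has
            // a byte sequence that is invalid in the source charset
            throw new Exception("iconv conversion failed");
        }
    }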

    // iconv leaves the number of bytes remaining in the output buffer in outputLength,
    // so the converted size is the original buffer size minus the remaining size
    outputLength = (content.length * 4) - outputLength;

    // then slice it to get the result; .idup makes the immutable copy a string needs
    string convertedContent = startingOutputBuffer[0 .. outputLength].idup;
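
Putting those steps together, a minimal sketch of a reusable function could look like the following. The name `toUtf8` and the grow-on-`E2BIG` strategy are my own assumptions, not something iconv prescribes. The bindings are declared by hand as above; with glibc the symbols live in libc itself, so this sketch leaves out `pragma(lib, "iconv")`, which you may still need on other systems.

```d
import core.stdc.errno : errno, E2BIG;
import std.string : toStringz;

// Hand-written bindings to the C iconv API (same declarations as in the post).
extern(C) {
    alias iconv_t = void*;
    iconv_t iconv_open(const char* tocode, const char* fromcode);
    size_t iconv(iconv_t cd, char** inbuf, size_t* inbytesleft,
                 char** outbuf, size_t* outbytesleft);
    int iconv_close(iconv_t cd);
}

/// Hypothetical helper: converts `content` from `fromEncoding` to UTF-8.
string toUtf8(const(ubyte)[] content, string fromEncoding)
{
    auto cd = iconv_open("UTF-8", toStringz(fromEncoding));
    if (cd == cast(iconv_t) -1)
        throw new Exception("iconv_open failed for " ~ fromEncoding);
    scope(exit) iconv_close(cd);

    // iconv takes mutable pointers but only reads through inbuf.
    char* input = cast(char*) content.ptr;
    size_t inputLength = content.length;

    // Start with 4x the input size (plus slack) and grow if that is not enough.
    auto output = new char[content.length * 4 + 16];
    size_t used = 0;

    while (inputLength > 0)
    {
        // Recompute the write position each pass, since output may have moved.
        char* outPtr = output.ptr + used;
        size_t outLeft = output.length - used;
        immutable ret = iconv(cd, &input, &inputLength, &outPtr, &outLeft);
        used = output.length - outLeft;

        if (ret == cast(size_t) -1)
        {
            if (errno == E2BIG)
                output.length = output.length * 2; // out of room: grow, retry
            else
                throw new Exception("iconv failed: invalid or incomplete input");
        }
    }

    // Copy the filled part into an immutable string.
    return output[0 .. used].idup;
}
```

Usage would then be something like `auto s = toUtf8(rawBytes, "CP1252");`, where `rawBytes` is whatever you read from the file or socket.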




Note that I think GNU iconv is GPL licensed (possibly LGPL for the library itself), so check the terms before linking it into your project.
