http://llvm.org/bugs/show_bug.cgi?id=13602
Hyeon-Bin Jeong <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |

--- Comment #2 from Hyeon-Bin Jeong <[email protected]> 2012-08-17 15:52:48 CDT ---
I think I'm doing exactly what you describe and what the standard intends.
(Am I missing something?)

I'm trying to write a codecvt that converts the external char type (UTF-8) to
the internal char32_t type (UTF-32). A UTF-8 character takes 1 to 4 bytes, so
this is an N:1 conversion.

When underflow() is called, it fills the 4096-byte external buffer with a
char (UTF-8) sequence from the FILE object, and then converts it into char32_t
(UTF-32) characters in the internal buffer by calling in():

    __r = __cv_->in(__st_, __extbuf_, __extbufend_, __extbufnext_,
                    this->eback() + __unget_sz, this->egptr(), __inext);

But this produces only about 1300 char32_t characters when converting Asian
text, because most characters in Asian languages are 3 bytes wide in UTF-8.
So __inext advances by only about a third of the buffer size.

The problem occurs a few lines below, when it calls setg():

    this->setg(this->eback(), this->eback() + __unget_sz, __inext);

This line sets the end of the (internal) buffer (i.e. __einp_) to __inext. So
after this line, egptr() returns a position about one third of the way from
__intbuf_ to __intbuf_ + __ibs_.

On the next underflow() call, the read size __nmemb is computed from
egptr() - eback(), so it loads only 33% of the external buffer! As a result,
the buffer size keeps shrinking on each underflow() call until it is down to
1 byte.

Here is the sample code from the standard. It uses intern_buf+ISIZE as the
buffer end, not egptr():

    char extern_buf[XSIZE];
    char* extern_end;
    charT intern_buf[ISIZE];
    charT* intern_end;
    codecvt_base::result r =
        a_codecvt.in(state, extern_buf, extern_buf+XSIZE, extern_end,
                     intern_buf, intern_buf+ISIZE, intern_end);

One more thing: I think seekpos() should restore the state from its position
argument.
seekoff() saves the current state into the position and returns it, so
seekpos() needs to restore the state from that position.
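
To make the shrinking-buffer argument concrete, here is a minimal sketch that models repeated underflow() calls outside of basic_filebuf. It is not the libc++ code: read_sizes, Utf8ToUtf32 and the window_from_egptr flag are hypothetical names, the unget area is ignored, and leftover partial bytes are not carried over the way a real filebuf would carry them. It also assumes that in() leaves an incomplete trailing UTF-8 sequence unconsumed, and that std::codecvt<char32_t, char, std::mbstate_t> (deprecated since C++20) is available.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>
#include <vector>

// std::codecvt has a protected destructor, so wrap it to allow a local object.
struct Utf8ToUtf32 : std::codecvt<char32_t, char, std::mbstate_t> {
    ~Utf8ToUtf32() {}
};

// Hypothetical model of repeated underflow() calls. Returns the number of
// external bytes requested on each call. If window_from_egptr is true, the
// next request is capped by how many char32_t the previous call produced
// (mimicking egptr() - eback()); otherwise it is capped by the full buffer
// size (mimicking intern_buf + ISIZE).
std::vector<std::size_t> read_sizes(const std::string& utf8,
                                    std::size_t buf_size,
                                    bool window_from_egptr) {
    Utf8ToUtf32 cvt;
    std::mbstate_t st = std::mbstate_t();
    std::vector<char32_t> internal(buf_size);
    std::vector<std::size_t> sizes;
    std::size_t window = buf_size;
    const char* next = utf8.data();
    const char* const end = next + utf8.size();
    while (next != end) {
        const std::size_t nmemb =
            std::min(window, static_cast<std::size_t>(end - next));
        sizes.push_back(nmemb);
        const char* from_next = next;
        char32_t* to_next = internal.data();
        cvt.in(st, next, next + nmemb, from_next,
               internal.data(), internal.data() + buf_size, to_next);
        if (from_next == next) break;  // window too small for one character
        next = from_next;
        if (window_from_egptr)
            window = static_cast<std::size_t>(to_next - internal.data());
    }
    return sizes;
}
```

Under those assumptions, feeding 3000 copies of a 3-byte character (for example U+AC00, "\xEA\xB0\x80") with buf_size 4096 and window_from_egptr set to true should request 4096, 1365, 455, 151, 50, 16, 5 and finally 1 byte, at which point the model stalls. With the flag set to false, every request except the last stays at 4096 bytes.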
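
The seekoff()/seekpos() point can be illustrated with a toy type. This is only a sketch of the intended round trip, not the libc++ implementation: ToyBuf, tell() and seek() are hypothetical names, and an int stands in for the conversion state because mbstate_t is opaque. The only library facility used is std::fpos, whose state() member stores and returns the saved state.

```cpp
#include <cassert>
#include <ios>
#include <string>

// Hypothetical toy buffer: shift_state stands in for the codecvt conversion
// state (mbstate_t in a real filebuf), offset for the file position.
struct ToyBuf {
    typedef std::fpos<int> pos_type;
    std::streamoff offset;
    int shift_state;

    // Like seekoff(0, cur): report the position together with the state.
    pos_type tell() const {
        pos_type pos(offset);
        pos.state(shift_state);
        return pos;
    }

    // Like seekpos(): restore both the offset and the saved state.
    void seek(pos_type pos) {
        offset = static_cast<std::streamoff>(pos);
        shift_state = pos.state();
    }
};
```

If seek() copied only the offset, a stateful encoding would resume decoding with whatever state happened to be current, which is the mismatch described above.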
