thx, I'll try that next time

Cheers,
Jürgen
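
For reference, a minimal sketch of the approach Floh describes in the quoted message below: walk the bytes in HEAPU8 until the terminating NUL and map each one through a 256-entry table. The names `bytesToString` and `identityMap` are made up for illustration, the heap here is a stand-in Uint8Array rather than a real emscripten heap, and the identity table (which amounts to Latin-1) is only a placeholder for a real codepage 437 table.

```javascript
// Sketch only: `heapU8` stands in for emscripten's HEAPU8 (a Uint8Array view
// of the C heap); `ptr` is the byte offset that C code would pass as a char*.
function bytesToString(heapU8, ptr, codeMap) {
  let str = '';
  // Stop at the NUL terminator or at the end of the heap, whichever is first.
  for (let i = ptr; i < heapU8.length && heapU8[i] !== 0; i++) {
    // Uint8Array elements are always in 0..255, so no masking is needed.
    str += String.fromCodePoint(codeMap[heapU8[i]]);
  }
  return str;
}

// Placeholder table (an assumption for this sketch): byte n maps to code
// point n. A real codepage 437 table would supply different entries,
// at least for the range 0x80..0xff.
const identityMap = Array.from({ length: 256 }, (_, i) => i);

// Fake heap with the bytes "J\xfcrgen\0" written at offset 16.
const heap = new Uint8Array(64);
heap.set([0x4a, 0xfc, 0x72, 0x67, 0x65, 0x6e, 0x00], 16);

console.log(bytesToString(heap, 16, identityMap));
```

In a real build the call would presumably look like bytesToString(Module.HEAPU8, ptr, cp437Map), assuming HEAPU8 is reachable on the Module object in the build configuration used.
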
On Sunday, 10 March 2019 01:39:37 UTC+1, Floh wrote:
>
> If you want to do the code-page conversion on the JS side then I think the
> easiest option is to call a JS function which directly reads the bytes from
> the global HEAPU8 array view (which is the unsigned-byte view into the
> emscripten C heap).
>
> A pointer on the C side is simply a 32-bit index into the HEAPU8 array.
>
> I'm doing something similar here:
>
> https://github.com/floooh/sokol/blob/441f952b36f67ce446f1f21c22dcc344b7f21ed8/sokol_audio.h#L1261
>
> ...except that instead of accessing the unsigned-byte view I'm accessing
> the float view (HEAPF32) to copy audio samples from the emscripten heap
> into a WebAudio buffer.
>
> This way at least you have fewer things that can go wrong, and can be
> quite sure that the values you're reading are between 0 and 255 (I'm not
> sure why you'd be getting out-of-bounds values from the getValue function;
> if this is a bug it might be worth writing an emscripten ticket).
>
> Cheers,
> -Floh.
>
> On Saturday, 9 March 2019 18:30:39 UTC+1, Juergen Wothke wrote:
>>
>> I don't see why anyone would want to go back into the stone age and
>> fiddle with legacy C code memory management and nonexistent String support
>> when I can handle that stuff easily on the JavaScript side.. provided some
>> emscripten API lets me access the respective raw data without fucking it up
>> beyond recognition (as Pointer_stringify() or UTF8ToString() do).
>>
>> As I mentioned above this.Module.getValue(ptr++, 'i8', true); already
>> seems to be a suitable API to deal with this scenario (the only problem is
>> to find it)!
>>
>> From what you said the text that is currently in the above docs is
>> outdated anyway, see:
>>
>>> "Strings in JavaScript must be converted to pointers for compiled code
>>> – the relevant function is Pointer_stringify(), which given a pointer
>>> returns a JavaScript string"
>>
>> So when that doc is updated it would be a good idea to add some extra
>> info for those people that DON'T HAVE UTF-8 input.
>> Explain how to use getValue(), e.g.
>> -s EXTRA_EXPORTED_RUNTIME_METHODS="['getValue']"
>>
>> PS: I still don't understand why an "i8" can be > 0xff !
>>
>> Cheers,
>> Jürgen
>>
>>
>> On Monday, 4 March 2019 14:44:27 UTC+1, Floh wrote:
>>>
>>> AFAIK Pointer_stringify() has been deprecated in favour of a function
>>> called UTF8ToString() which takes a UTF-8-encoded string in the emscripten
>>> heap and returns a JS string; maybe the docs haven't been updated yet. But
>>> I think (but may be wrong) it's just a renaming, and that
>>> Pointer_stringify() could already deal with UTF-8 strings before.
>>>
>>> Since ASCII is a subset of UTF-8, this would also work for proper
>>> (7-bit) ASCII strings.
>>>
>>> 8-bit characters with code page encoding are a different topic though.
>>> Since code pages are pretty much legacy, and completely unknown in the web
>>> world, I would personally prefer to not have extra code-page-aware string
>>> functions in the emscripten API. Instead I would convert the strings on the
>>> C side first from a specific code page encoding into generic UTF-8 before
>>> handing them over to JS.
>>>
>>> Cheers,
>>> -Floh.
>>>
>>> On Monday, 4 March 2019 12:43:33 UTC+1, Juergen Wothke wrote:
>>>>
>>>> I often have the situation (e.g.
>>>> see
>>>> http://www.wothke.ch/playmod/?file=/modules/Ad%20Lib/AMusic/Admiral/mein%20erster%20versuch%20!!!.amd)
>>>> that some legacy C program delivers some char*-based string and that
>>>> original char buffer may be using all kinds of weird character encoding
>>>> schemes (ASCII, codepage 437, whatever..).
>>>>
>>>> What all these text buffers have in common is that Pointer_stringify is
>>>> completely unsuitable to deal with them. And yet Pointer_stringify seems
>>>> to be the ONLY API properly advertised in the emscripten docs (see
>>>> https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html
>>>> ).
>>>>
>>>> Even though there actually seem to be undocumented functions available
>>>> (like AsciiToString, UTF8ToString, UTF16ToString, etc?) that might
>>>> actually be useful - at least in some of those scenarios - many people
>>>> are probably unaware that they exist. At one point I had actually
>>>> started to base64 encode my texts just so that I would be able to
>>>> retrieve the original uncorrupted data on the JavaScript side ... which
>>>> is just ridiculous..
>>>>
>>>> The last hack I used for codepage 437 encoded strings looked like this:
>>>>
>>>>     this.codeMap = [ // codepage 437 used by PC DOS and MS-DOS
>>>>         ....
>>>>     ];
>>>>
>>>>     cp437ToString: function(ptr) { // Pointer_stringify replacement: msdos text to unicode..
>>>>         var str = '';
>>>>         while (1) {
>>>>             var ch = this.Module.getValue(ptr++, 'i8', true);
>>>>             if (!ch) return str;
>>>>             str += String.fromCharCode(this.codeMap[ch & 0xff]);
>>>>         }
>>>>     },
>>>>
>>>> Either I just missed the relevant docs for emscripten functions that
>>>> would be useful in these kinds of scenarios - in which case the docs
>>>> should maybe be improved. Or if the functionality is actually not there
>>>> then I wonder why - since I can hardly be the only person dealing with
>>>> this kind of scenario.
>>>>
>>>> PS: I am also surprised by the Module.getValue(ptr++, 'i8', true);
>>>> function: 'i8' seems to suggest that I should be getting an 8-bit
>>>> integer and yet the returned values are sometimes bigger than 0xff! ??
>>>>
>>>
--
You received this message because you are subscribed to the Google Groups "emscripten-discuss" group.
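
On the PS about 'i8' above: as far as I know, getValue(ptr, 'i8') reads through the signed HEAP8 view, so bytes from 0x80 upwards come back as negative numbers rather than as values above 0xff. Treat that as an assumption about the emscripten version in use. The sketch below uses plain typed arrays (no emscripten involved) to show how the same byte reads through a signed and an unsigned view, and why the `& 0xff` in cp437ToString recovers the expected value.

```javascript
// Two views over the same memory, analogous to emscripten's HEAP8 and HEAPU8.
const buffer = new ArrayBuffer(4);
const signedView = new Int8Array(buffer);
const unsignedView = new Uint8Array(buffer);

unsignedView[0] = 0xe4; // one byte above 0x7f

const asSigned = signedView[0];     // same bits, read as two's complement
const asUnsigned = unsignedView[0]; // same bits, read as 0..255
const masked = asSigned & 0xff;     // masking turns the signed read back into 0..255

// Reinterpreting the negative number as an unsigned 32-bit value gives
// something far above 0xff.
const asUint32Hex = (asSigned >>> 0).toString(16);

console.log(asSigned, asUnsigned, masked, asUint32Hex);
```

If a negative value like this ends up printed or compared as an unsigned 32-bit number, it looks much larger than 0xff, which may be what was observed here. Reading from the unsigned view avoids the question altogether.
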
