Re: interacting with code...

Juergen Wothke Sun, 10 Mar 2019 12:54:59 -0700

Thanks, I'll try that next time.

Cheers,
Jürgen


On Sunday, 10 March 2019 01:39:37 UTC+1, Floh wrote:
>
> If you want to do the code-page conversion on the JS side then I think the 
> easiest option is to call a JS function which directly reads the bytes from 
> the global HEAPU8 array view (which is the unsigned-byte-view into the 
> emscripten C heap).
>
> A pointer on the C side is simply a 32-bit index into the HEAPU8 array.
>
> I'm doing something similar here:
>
>
> https://github.com/floooh/sokol/blob/441f952b36f67ce446f1f21c22dcc344b7f21ed8/sokol_audio.h#L1261
>
> ...except that instead of accessing the unsigned-byte view I'm accessing 
> the float-view (HEAPF32) to copy audio samples from the emscripten heap 
> into a WebAudio buffer.
>
> This way at least you have fewer things that can go wrong, and can be 
> quite sure that the values you're reading are between 0 and 255. (I'm not 
> sure why you'd be getting out-of-bounds values from the getValue function; 
> if this is a bug, it might be worth filing an emscripten ticket.)
>
> Cheers,
> -Floh.
>
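A minimal sketch of the HEAPU8 approach described above, under stated assumptions: `bytesToString` is a hypothetical helper name, the heap is simulated with a plain `Uint8Array` so the snippet runs without emscripten, and the lookup table is an identity placeholder with a single CP437 entry filled in (0x81, "ü"), not the full code page.

```javascript
// Hypothetical helper: decode a NUL-terminated byte string from a heap view.
// `heap` stands in for emscripten's HEAPU8 (a Uint8Array over the C heap);
// `codeMap` is a 256-entry table mapping each byte to a Unicode code point.
function bytesToString(heap, ptr, codeMap) {
  var str = '';
  // Stop at the terminating 0 byte, or at the end of the view as a safety net.
  while (ptr < heap.length && heap[ptr] !== 0) {
    str += String.fromCharCode(codeMap[heap[ptr++]]);
  }
  return str;
}

// Stand-in data so the sketch runs on its own.
// Bytes at offset 4: 'H', 0x81, '!', NUL.
var fakeHeap = new Uint8Array([0, 0, 0, 0, 0x48, 0x81, 0x21, 0x00]);

// Placeholder table: identity mapping, NOT the real CP437 table.
var demoMap = [];
for (var i = 0; i < 256; i++) demoMap[i] = i;
demoMap[0x81] = 0x00fc; // CP437 0x81 is "ü" (U+00FC)

var result = bytesToString(fakeHeap, 4, demoMap);
console.log(result);
```

In a real build the first argument would be the heap view itself (for example `Module.HEAPU8`, assuming the build exposes it that way) and the second the `char*` value returned from C. Because the view is unsigned, every element is already in the range 0 to 255 and no masking is needed.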
> On Saturday, 9 March 2019 18:30:39 UTC+1, Juergen Wothke wrote:
>>
>> I don't see why anyone would want to go back into the stone age and 
>> fiddle with legacy C code memory management and nonexistent string support 
>> when I can handle that stuff easily on the JavaScript side, provided some 
>> emscripten API lets me access the respective raw data without fucking it up 
>> beyond recognition (as Pointer_stringify() or UTF8ToString() do).
>>
>> As I mentioned above this.Module.getValue(ptr++, 'i8', true); already 
>> seems to be a suitable API to deal with this scenario (the only problem is 
>> to find it)!
>>
>> From what you said the text that is currently in the above docs is 
>> outdated anyway, see:
>>
>>>
>>> "Strings in JavaScript must be converted to pointers for compiled code 
>>> – the relevant function is Pointer_stringify(), which given a pointer 
>>> returns a JavaScript string"
>>
>>
>>
>> So when that doc is updated it would be a good idea to add some extra 
>> info for those people that DON'T HAVE UTF-8 input.
>> Explain how to use getValue(), e.g.
>>       -s EXTRA_EXPORTED_RUNTIME_METHODS="['getValue']"
>>
>>
>> PS: I still don't understand why an "i8" can be > 0xff !
>>
>> Cheers,
>> Jürgen
>>
>>
>> On Monday, 4 March 2019 14:44:27 UTC+1, Floh wrote:
>>>
>>> AFAIK Pointer_stringify() has been deprecated in favour of a function 
>>> called UTF8ToString(), which takes a UTF-8-encoded string in the emscripten 
>>> HEAP and returns a JS string; maybe the docs haven't been updated yet. But 
>>> I think (I may be wrong) it's just a renaming, and that 
>>> Pointer_stringify() could already deal with UTF-8 strings.
>>>
>>> Since ASCII is a subset of UTF-8, this also works for proper 
>>> (7-bit) ASCII strings.
>>>
>>> 8-bit characters with code page encoding are a different topic, though. 
>>> Since code pages are pretty much legacy and completely unknown in the web 
>>> world, I would personally prefer not to have extra code-page-aware string 
>>> functions in the emscripten API. Instead, I would convert the strings on the 
>>> C side first from a specific code page encoding into generic UTF-8 before 
>>> handing them over to JS.
>>>
>>> Cheers,
>>> -Floh.
>>>
>>> On Monday, 4 March 2019 12:43:33 UTC+1, Juergen Wothke wrote:
>>>>
>>>> I often have the situation (e.g. see 
>>>> http://www.wothke.ch/playmod/?file=/modules/Ad%20Lib/AMusic/Admiral/mein%20erster%20versuch%20!!!.amd)
>>>>  
>>>> that some legacy C program delivers some char* based String and that 
>>>> original char buffer may be using all kinds of weird character encoding 
>>>> schemes (ASCII, codepage 437, whatever..).
>>>>
>>>> What all these text buffers have in common is that Pointer_stringify is 
>>>> completely unsuitable to deal with them. And yet Pointer_stringify seems 
>>>> to be the ONLY API properly advertised in the emscripten docs (see 
>>>> https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html
>>>> ).
>>>>
>>>> Even though there actually seem to be undocumented functions available 
>>>> (like AsciiToString, UTF8ToString, UTF16ToString, etc.?) that might 
>>>> actually be useful, at least in some of those scenarios, many people are 
>>>> probably unaware that they exist. At one point I had actually started to 
>>>> base64-encode my texts just so that I would be able to retrieve the 
>>>> original uncorrupted data on the JavaScript side ... which is just 
>>>> ridiculous.
>>>>
>>>> The last hack I used for codepage 437 encoded strings looked like this:
>>>>
>>>>  this.codeMap = [ // codepage 437 used by PC DOS and MS-DOS
>>>>                    ....
>>>>                 ];
>>>>
>>>>  // Pointer_stringify replacement: MS-DOS text to Unicode
>>>>  cp437ToString: function(ptr) {
>>>>    var str = '';
>>>>    while (1) {
>>>>      var ch = this.Module.getValue(ptr++, 'i8', true);
>>>>      if (!ch) return str;
>>>>      str += String.fromCharCode(this.codeMap[ch & 0xff]);
>>>>    }
>>>>  },
>>>>
>>>>
>>>>
>>>> Either I just missed the relevant docs for emscripten functions that 
>>>> would be useful in these kinds of scenarios, in which case the docs 
>>>> should maybe be improved, or the functionality is actually not there, in 
>>>> which case I wonder why, since I can hardly be the only person dealing 
>>>> with this kind of scenario.
>>>>
>>>> PS: I am also surprised by the Module.getValue(ptr++, 'i8', true); 
>>>> function: 'i8' seems to suggest that I should be getting an 8-bit integer, 
>>>> and yet the returned values are sometimes bigger than 0xff!
>>>>
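A small sketch of how signed and unsigned byte views differ, which is one reason the `& 0xff` mask in the hack above matters. It assumes only standard typed arrays: `Int8Array` is the element type of emscripten's HEAP8 and `Uint8Array` that of HEAPU8. It does not by itself explain values above 0xff, which the thread leaves open; it only shows that a byte above 0x7f reads as a negative number through a signed view.

```javascript
// Two views over the same underlying bytes, analogous to HEAP8 and HEAPU8.
var buffer = new ArrayBuffer(4);
var signedView = new Int8Array(buffer);    // same element type as HEAP8
var unsignedView = new Uint8Array(buffer); // same element type as HEAPU8

unsignedView[0] = 0x9b; // store a byte above 0x7f

var asSigned = signedView[0];     // read through the signed view
var asUnsigned = unsignedView[0]; // read through the unsigned view
var masked = asSigned & 0xff;     // mask the signed value back to 0..255

console.log(asSigned, asUnsigned, masked);
```

Reading through the unsigned view, or masking the signed value, both give an index that is safe to use with a 256-entry lookup table.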
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
