The issue may be that 64-bit integers are passed as two 32-bit integers,
in which case the fix is to receive/send both halves properly (maybe only
the low bits are received, for example). Building with LIBRARY_DEBUG=1 or
SYSCALL_DEBUG=1 might help here: it prints each call with its arguments
and return value, so you can find which syscall is relevant.
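
As a rough sketch of what "two 32-bit integers" means in practice (the
helper names splitI64 and joinI64 are made up for illustration and are not
Emscripten API), a value can be split into a low and a high half and then
put back together. If only the low half arrives, a 3 GiB size shows up as
a negative number:

```javascript
// Hypothetical helpers for illustration only; not part of Emscripten.
const TWO_32 = 4294967296; // 2^32

// Split an integer (magnitude below 2^53) into two signed 32-bit halves.
function splitI64(value) {
  const low = value % TWO_32;               // low 32 bits
  const high = Math.floor(value / TWO_32);  // high 32 bits
  return { low: low | 0, high: high | 0 };  // coerce both to int32
}

// Recombine the halves; `>>> 0` reinterprets the low half as unsigned.
function joinI64(low, high) {
  return high * TWO_32 + (low >>> 0);
}

const size = 3 * 1024 * 1024 * 1024; // 3 GiB, larger than 2^31
const parts = splitI64(size);

console.log(parts.low);   // low half alone, read as signed: -1073741824
console.log(parts.high);  // 0
console.log(joinI64(parts.low, parts.high) === size); // true
```

The result is exact only while the value fits in a JS number (below 2^53),
which should be plenty for file sizes.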

However, it's also possible the issue is that musl uses a 32-bit signed
integer for those syscalls, in which case the syscall interface would need
to be changed.
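
To illustrate what a signed 32-bit slot does to such values (a sketch only;
toInt32 is a hypothetical helper, and I am assuming the coercion behaves
like asm.js's `| 0`, not claiming this is where the truncation actually
happens):

```javascript
// Hypothetical helper: `| 0` applies JavaScript's ToInt32 conversion,
// which is how asm.js code coerces a value to a signed 32-bit int.
function toInt32(value) {
  return value | 0;
}

const limit = 2147483648; // 2^31

console.log(toInt32(limit - 1));     // 2147483647, still fits
console.log(toInt32(limit));         // -2147483648, wraps negative
console.log(toInt32(3 * 1073741824)); // 3 GiB becomes -1073741824
```

Any size or offset at or above 2^31 comes out negative after such a
coercion, which would match the symptoms described below.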

On Thu, Mar 1, 2018 at 11:24 PM, Sören Balko <soeren.ba...@gmail.com> wrote:

> Hi,
>
> I have run into an issue where our code tries to read very large files
> (>2^31 bytes in size) and is effectively running into what looks like an
> integer overflow issue. What happens is that the int64_t members of stat_t
> ("size") and also the return value of llseek are implicitly down-cast into
> signed ints. Here is what we do to mount our file system (slightly
> simplified for brevity):
>
>     var node = Module.FS.createFile('/', emscriptenPath, null, true, true);
>
>     node.node_ops = {
>         getattr: function(ganode) {
>             return {
>                 dev: 1,
>                 ino: ganode.id,
>                 mode: ganode.mode,
>                 nlink: 1,
>                 uid: 0,
>                 gid: 0,
>                 rdev: ganode.rdev,
>                 size: size,  // <-- this is a file size > 2^31
>                 atime: new Date(ganode.timestamp),
>                 mtime: new Date(ganode.timestamp),
>                 ctime: new Date(ganode.timestamp),
>                 blksize: 4096,
>                 blocks: Math.ceil(size / 4096)
>             };
>         }
>     };
>
>     node.stream_ops = {
>         llseek: function(stream, offset, whence) {
>             switch (whence) {
>                 case 0: // SEEK_SET
>                     stream.position = offset;
>                     break;
>                 case 1: // SEEK_CUR
>                     stream.position += offset;
>                     break;
>                 case 2: // SEEK_END
>                     stream.position = size + offset;
>                     break;
>                 default:
>                     throw new Module.FS.ErrnoError(22); // EINVAL
>             }
>
>             return stream.position; // <-- can be > 2^31
>         },
>         read: function(stream, buffer, heapOffset, numberOfBytes,
>                        fileOffset) {
>             // ...
>         }
>     };
>
> I suspect that the issue arises because int64_t has no native counterpart
> in JS and is therefore downcast at the interface between the asm.js code
> and the file system code. Is there a quick fix to address this issue? I
> tried -s PRECISE_I64_MATH=2, but to no avail. Also, I am not entirely sure
> where exactly the precision is lost. I guess it happens in the
> __syscallXY functions for fstat and lseek (and probably also for the
> arguments passed into read).
>
> One idea I had was to patch the syscalls so that the int64_t values are
> rendered as strings on the heap, with the pointer to each string passed
> back inside the stat_t structure and as the return value of llseek. These
> strings would then have to be parsed back into int64_t values inside the
> syscalls. Not exactly elegant, but it might work. Or is there a generic
> solution?
>
> Thanks heaps in advance for any suggestions...
>
> Soeren
>
> --
> You received this message because you are subscribed to the Google Groups
> "emscripten-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to emscripten-discuss+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
