On Tue, Jan 14, 2020 at 10:15 AM Lewis Tian <[email protected]> wrote:
>
> On Tuesday, January 14, 2020 at 3:45:09 PM UTC+8, Nadav Har'El wrote:
>>
>> On Tue, Jan 14, 2020 at 9:28 AM Lewis Tian <[email protected]> wrote:
>>
>>> When I run pagerank on Ubuntu, the code works fine. (the graph data is
>>> stored in web-NotreDame.txt, 21M)
>>>
>>
>> 21M isn't very large, it shouldn't present any special problems. We
>> probably have a bug that isn't just about file size:
>>
>>> taseikyo@ubuntu:~/Desktop/osv/apps/my-pagerank-test$ ls
>>> Makefile module.py pagerank.c usr.manifest web-NotreDame.txt
>>>
>>> taseikyo@ubuntu:~/Desktop/osv/apps/my-pagerank-test$ make
>>> cc -pie -o pagerank pagerank.c
>>>
>>> taseikyo@ubuntu:~/Desktop/osv/apps/my-pagerank-test$ ll -h
>>> total 21M
>>> drwxrwxr-x   2 taseikyo taseikyo 4.0K Jan 14 11:59 ./
>>> drwxrwxr-x 128 taseikyo taseikyo 4.0K Jan 14 11:51 ../
>>> -rw-rw-r--   1 taseikyo taseikyo  110 Jan 14 11:51 Makefile
>>> -rw-rw-r--   1 taseikyo taseikyo   60 Jan 13 11:26 module.py
>>> -rwxrwxr-x   1 taseikyo taseikyo  17K Jan 14 11:59 pagerank*
>>> -rw-rw-r--   1 taseikyo taseikyo 5.6K Jan 14 11:51 pagerank.c
>>> -rw-rw-r--   1 taseikyo taseikyo   85 Jan 14 11:53 usr.manifest
>>> -rw-rw-r--   1 taseikyo taseikyo  21M Jan 14 11:54 web-NotreDame.txt
>>>
>>> taseikyo@ubuntu:~/Desktop/osv/apps/my-pagerank-test$ ./pagerank
>>> Graph data:
>>>
>>> Nodes: 325729, Edges: 1497134
>>>
>>>
>>> Number of iteration to converge: 52
>>>
>>> Final Pagerank values:
>>> [0.002066 , 0.000181 , ...]
>>>
>>> Time spent: 0.896491 seconds.
>>>
>>> But when I build and run pagerank on osv, it fails to read the graph
>>> data (only read part of the graph).
>>>
>>> taseikyo@ubuntu:~/Desktop/osv$ ./scripts/build image=my-pagerank-test
>>> taseikyo@ubuntu:~/Desktop/osv$ ./scripts/run.py
>>> OSv v0.54.0-71-g69a0ce39
>>> eth0: 192.168.122.15
>>> Booted up in 338.86 ms
>>> Cmdline: /pagerank
>>>
>>> Graph data:
>>>
>>> Nodes: 325729, Edges: 1497134
>>>
>>> Fail to read data...
>>>
>>> From: 6 To: 119
>>>
>>> Here is part of the code:
>>>
>>> while (!feof(fp)) {
>>>     fret = fscanf(fp, "%d%d", &fromnode, &tonode);
>>>     if (fret == 0) {
>>>         printf("Fail to read data...\n");
>>>         printf("\n From: %d To: %d\n",fromnode, tonode);
>>>         return -1;
>>>     }
>>>     ...
>>> }
>>>
>>
>> Maybe we have a bug in fscanf or in the stdio reading layer?
>> Things I'd like you to please check:
>> 1. Print a failure already if fret < 2 - since 1 is also a failure.
>> 2. On failure, please print ftell(fp) - our position in the file (is it
>> the end? something in the middle?). Please also do a fgets() or a short
>> fgetc() loop or fread() to read the next available bytes, to try to
>> understand why fscanf() failed. Is the reading from the file failing, or is
>> the parsing failing?
>>
>
> Thanks for your advice,
> On failure, the output is as follows:
>
> Graph data:
>
> Nodes: 325729, Edges: 1497134
>
> fret: 1
> position: 1025
> next char:
>
> fret is 1, fp is in the middle, fgetc or fgets cannot read anything. I
> think it should be the former case.
>

Interesting. Smells like a serious stdio bug that needs to be debugged :-(

I don't think it's a coincidence that the position is 1025 when stdio's
BUFSIZ is 1024: the failure seems to happen right at the first buffer
boundary.

Just as a completely wild guess, could you please check whether the
following patch to libc/internal/shgetc.c helps?

@@ -22,5 +22,6 @@
     else
         f->shend = f->rend;
     if (f->rend) f->shcnt += f->rend - f->rpos + 1;
+    if (f->rpos[-1] != c) f->rpos[-1] = c;
     return c;
 }

>
>>
>>> When I use a small graph (4 nodes, 7 edges), it runs normally.
>>>
>>> taseikyo@ubuntu:~/Desktop/osv$ ./scripts/build image=my-pagerank-test
>>> taseikyo@ubuntu:~/Desktop/osv$ ./scripts/run.py
>>> OSv v0.54.0-71-g69a0ce39
>>> eth0: 192.168.122.15
>>> Booted up in 356.44 ms
>>> Cmdline: /pagerank
>>>
>>> Graph data:
>>>
>>> Nodes: 4, Edges: 7
>>>
>>>
>>> Number of iteration to converge: 41
>>>
>>> Final Pagerank values:
>>>
>>> [0.159913 , 0.144016 , 0.144016 , 0.082809 ]
>>>
>>> Time spent: 0.693802 seconds.
>>>
>>> Is osv unable to read large files (bug?) I'll appreciate your help very
>>> much! : )
>>>

--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/CANEVyjvmMrCfHp7_mGm9RE%3Dix8HVyQbhh2OBf5_PdVVA%3DFxa6Q%40mail.gmail.com.
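
For reference, the two checks requested earlier in this thread (treat fret < 2 as a failure, then print ftell(fp) and try to read the next bytes) could look roughly like the sketch below. This is only an illustration: the function name read_edges, the buffer size, and the message wording are hypothetical and are not taken from the real pagerank.c, which would also store each edge instead of just counting it.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read "from to" integer pairs from fp until EOF.
 * Returns the number of pairs read, or -1 after printing diagnostics
 * when a pair could not be read completely.
 */
long read_edges(FILE *fp)
{
    assert(fp != NULL);

    long edges = 0;
    int fromnode, tonode;

    for (;;) {
        int fret = fscanf(fp, "%d%d", &fromnode, &tonode);

        if (fret == 2) {
            edges++;              /* a real program would store the edge here */
            continue;
        }
        if (fret == EOF && feof(fp)) {
            return edges;         /* clean end of input, nothing left to parse */
        }

        /* fret is 0 or 1 (partial parse), or EOF caused by a read error. */
        char buf[64];
        printf("fret: %d\n", fret);
        printf("position: %ld, feof: %d, ferror: %d\n",
               ftell(fp), feof(fp), ferror(fp));
        if (fgets(buf, sizeof buf, fp) != NULL)
            printf("next bytes: \"%s\"\n", buf);
        else
            printf("next bytes: (nothing could be read)\n");
        return -1;
    }
}
```

Looping on the fscanf() return value instead of on !feof(fp) also avoids one extra, failing iteration at the end of the file. If fgets() still returns data after a failure, the parsing is the suspect; if it returns nothing in the middle of the file, the read path is.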
