The points you mention are part of the reason I chose to write the CSV lexer the way I did. It follows one of the fastest Haskell CSV parsers, and I was curious to see how using linear types could optimize performance.
Regarding your suggestion on how to make better use of $ldelay in my code: I'm stuck on a compiler error that I can't make sense of. The following pseudo-minimal example throws the same kind of errors:

  #include "share/atspre_define.hats"
  #include "share/atspre_staload.hats"
  staload UN = "prelude/SATS/unsafe.sats"
  staload SBF = "libats/SATS/stringbuf.sats"
  staload _(*SBF*) = "libats/DATS/stringbuf.dats"

  datatype DT = D_T of @{ alpha = char }
  vtypedef llstring = stream_vt(char)

  fun test (acc: !$SBF.stringbuf, cs: llstring): stream_vt(DT) =
    $ldelay
    ( case !cs of
      | ~stream_vt_nil() =>
          if $SBF.stringbuf_get_size(acc) = i2sz(0)
          then stream_vt_nil()
          else stream_vt_cons(D_T(@{alpha = 'a'}), stream_vt_make_nil())
      | ~stream_vt_cons(c, cs1) =>
          let val crec = D_T(@{alpha = c})
          in stream_vt_cons(crec, test(acc, cs1))
          end
    , ~cs
    )

The compiler cannot infer the type I want (which is [stream_vt_con(DT)]) for the [stream_vt_nil()] following the first [then] in the function body. The error message says the dynamic expression cannot be assigned the type [S2EVar(5492)].

  [...] mismatch of sorts in unification:
  The sort of variable is: S2RTbas(S2RTBASimp(1; t@ype))
  The sort of solution is: S2RTbas(S2RTBASimp(2; viewtype))
  [...] mismatch of static terms (tyleq):
  The actual term is: S2Eapp(S2Ecst(stream_vt_con); S2EVar(5495))
  The needed term is: S2EVar(5492)

(There are further errors of the same form.) Is the culprit that [stream_vt] of a nonlinear datatype requires some special care? The version with [stream_vt_make_nil()] instead of explicit [$ldelay] works, so the error ought to be subtle.

Best wishes,
August

On Sunday, 5 March 2017 at 23:58:35 UTC+1, gmhwxi wrote:
>
> Yes, you definitely got it :)
>
> Stream_vt is very memory-frugal.
>
> Haskell relies on deforestation (a complex compiler optimization)
> to reduce memory usage of lazy evaluation. In ATS, deforestation is
> not supported. Instead, the programmer needs to recycle memory explicitly.
>
> Compared to Haskell, corresponding code using stream_vt in ATS can be
> much more efficient both time-wise and memory-wise.
>
> For instance, the following example (for computing Mersenne primes) can
> run for days without run-time GC:
>
> https://github.com/githwxi/ATS-Postiats/blob/master/doc/EXAMPLE/RosettaCode/Lucas-Lehmer_test2.dats
>
> It convincingly attests to the power of linear streams.
>
> Cheers!
>
> On Sun, Mar 5, 2017 at 5:34 PM, August Alm <augu...@gmail.com> wrote:
>
>> Thanks for the tip! I think I understand. I treated $ldelay much as a data constructor, so that all streams are equally lazy, whereas there are in fact many ways to sequence into thunks. Let me give an example to anchor the discussion. Both the following implementations of a map-template for linear streams typecheck:
>>
>> fun {a, b: t0ype}
>> map_make_cons
>>   ( xs: stream_vt(a)
>>   , f: a -> b
>>   ) : stream_vt(b) =
>>   case !xs of
>>   | ~stream_vt_nil() => stream_vt_make_nil()
>>   | ~stream_vt_cons(x, xs1) =>
>>       stream_vt_make_cons(f(x), map_make_cons(xs1, f))
>>
>> fun {a, b: t0ype}
>> map_ldelay
>>   ( xs: stream_vt(a)
>>   , f: a -> b
>>   ) : stream_vt(b) =
>>   $ldelay
>>   ( case !xs of
>>     | ~stream_vt_nil() => stream_vt_nil()
>>     | ~stream_vt_cons(x, xs1) =>
>>         stream_vt_cons(f(x), map_ldelay(xs1, f))
>>   , ~xs
>>   )
>>
>> The second is maximally lazy. The first, [map_make_cons], is less lazy because checking the case-conditions is not delayed. My code was like the first example, only much more was going on inside the case expressions. Is that a correct assessment?
>>
>> On Sunday, 5 March 2017 at 04:07:42 UTC+1, gmhwxi wrote:
>>>
>>> BTW, it seems you don't need to do much to fix the issue.
>>>
>>> Basically, you just do
>>>
>>> 1) Put the body of parse_entry into $ldelay(...)
>>> 2) Change stream_vt_make_cons into stream_vt_cons
>>>
>>> There may be a few other things but they should all be very minor.
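(Inline note: a minimal sketch of the style those two steps lead to, as far as I understand it. The helper name [ints_from] is hypothetical and not part of the lexer, and I am assuming only the prelude loaded by atspre_staload.hats. The body sits inside $ldelay and uses the constructor [stream_vt_cons] rather than [stream_vt_make_cons], so a call should hand back a thunk right away and the recursive call should run only when the tail is forced.)

```ats
#include "share/atspre_staload.hats"

(* Hypothetical helper, for illustration only: the stream n, n+1, n+2, ...
   Everything, including the recursive call, is delayed by $ldelay. *)
fun ints_from (n: int): stream_vt(int) =
  $ldelay(stream_vt_cons(n, ints_from(n+1)))

implement main0 () = {
  val xs = ints_from(0)                  // no cell has been built yet
  val- ~stream_vt_cons(x, xs1) = !xs     // force only the first cell
  val () = println!("head = ", x)
  val () = ~xs1                          // free the unevaluated tail
}
```

If the $ldelay were dropped and [stream_vt_make_cons] used instead, the recursive call would have to be evaluated before the function could return, which I take to be the list-like behaviour the advice warns about.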
>>>
>>> On Saturday, March 4, 2017 at 9:47:07 PM UTC-5, gmhwxi wrote:
>>>>
>>>> I took a glance at your code.
>>>>
>>>> I noticed a very common mistake involving the use of stream (or stream_vt). Basically, the way stream is used in your code is like the way list is used. This causes the stack issue you encountered.
>>>>
>>>> Say that you have a function that returns a stream. In nearly all cases, the correct way to implement such a function should use the following style:
>>>>
>>>> fun foo(...): stream_vt(...) = $ldelay
>>>> (
>>>>   ...
>>>> )
>>>>
>>>> The idea is that 'foo' should return in O(1) time. The body of $ldelay is only evaluated when the first element of the returned stream is needed. Sometimes, this is called full laziness. Without full laziness, a stream may behave like a list, defeating the very purpose of using a stream.
>>>>
>>>> On Saturday, March 4, 2017 at 7:27:03 PM UTC-5, August Alm wrote:
>>>>>
>>>>> I've spent a few hours trying to figure out how to make proper use of npm and gave up--for now. If the project turns into something more serious (i.e., useful to others) then I will have another go at it. For now my naive attempts at making effective use of linear streams can be witnessed at GitHub: https://github.com/August-Alm/ats_csv_lexer Any and all comments on how to improve are appreciated.
>>>>>
>>>>> Best wishes, August.
>>>>>
>>>>> On Friday, 3 March 2017 at 23:57:54 UTC+1, gmhwxi wrote:
>>>>>>
>>>>>> One possibility is to build an npm package and then publish it.
>>>>>>
>>>>>> If you go to https://www.npmjs.com/ and search for 'atscntrb', you can find plenty of packages. You may need to install npm first.
>>>>>>
>>>>>> If you do build an npm package, I suggest that you choose a namespace for yourself. E.g., atscntrb-a?a-..., where ? is the first letter of your middle name.
>>>>>>
>>>>>> On Fri, Mar 3, 2017 at 5:48 PM, August Alm <augu...@gmail.com> wrote:
>>>>>>
>>>>>>> How would I best share larger code portions? I have no concerns about making my mistakes public, heh.
>>>>>>>
>>>>>>> I believe everything is lazy as-is (all data is [stream_vt("sometype")]). And I've tried to write tail-recursive functional code. The algorithm is based on two mutually recursing functions, "fun ... and ..", similar to how you did things in your csv-parser (thanks for pointing out that piece of code). However, I cannot set them up with "fn* .. and .." to enforce a local jump because they call each other in a too intertwined way. Might that be it?
>>>>>>>
>>>>>>> On Friday, 3 March 2017 at 23:32:15 UTC+1, gmhwxi wrote:
>>>>>>>>
>>>>>>>> You are welcome!
>>>>>>>>
>>>>>>>> Since I have not seen your code, I could only guess :)
>>>>>>>>
>>>>>>>> Usually, what you described can be fixed by using tail-recursion, or by using lazy-evaluation. The former approach is straightforward. You just need to identify the function or functions that cause the deep stack usage. Then try to rewrite using tail-recursion.
>>>>>>>>
>>>>>>>> On Fri, Mar 3, 2017 at 5:25 PM, August Alm <augu...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi!
>>>>>>>>> I had indeed made a logical error that caused any stream with "carriage return" followed by "newline" to recurse indefinitely. Thank you for your patience and pedagogical instincts, Professor! There is still some issue though, one that I believe is more subtle. I fixed the logical error and my algorithm now handles all the test cases you suggested.
>>>>>>>>> However, when fed an actual CSV-file with a thousand rows and about 300 columns it still segfaults--unless I manually increase the stack space on my computer! I don't know exactly where the critical limit is, but increasing it from 8192 kbytes to 65536 certainly did the trick. The whole file parsed without problem, and rather quickly at that. It seems my algorithm makes too much use of stack allocation and that I may have to rethink some of my (would-be) optimization choices.
>>>>>>>>> Best wishes,
>>>>>>>>> August
>>>>>>>>>
>>>>>>>>> On Friday, 3 March 2017 at 15:22:00 UTC+1, gmhwxi wrote:
>>>>>>>>>>
>>>>>>>>>> Now you may do the following tests:
>>>>>>>>>>
>>>>>>>>>> Try:
>>>>>>>>>>
>>>>>>>>>> val ins = streamize_string_char("a;b") // should work
>>>>>>>>>>
>>>>>>>>>> Try:
>>>>>>>>>>
>>>>>>>>>> val ins = streamize_string_char("a;b\n") // may not work
>>>>>>>>>>
>>>>>>>>>> Try:
>>>>>>>>>>
>>>>>>>>>> val ins = streamize_string_char("a;b\015\012") // should cause crash
>>>>>>>>>>
>>>>>>>>>> On Thursday, March 2, 2017 at 9:21:21 PM UTC-5, gmhwxi wrote:
>>>>>>>>>>>
>>>>>>>>>>> When tried, I saw the following 5 chars (ascii) in small.csv:
>>>>>>>>>>>
>>>>>>>>>>> 97
>>>>>>>>>>> 59
>>>>>>>>>>> 98
>>>>>>>>>>> 13
>>>>>>>>>>> 10
>>>>>>>>>>>
>>>>>>>>>>> My testing code:
>>>>>>>>>>>
>>>>>>>>>>> #include"share/atspre_staload.hats"
>>>>>>>>>>> #include"share/HATS/atspre_staload_libats_ML.hats"
>>>>>>>>>>>
>>>>>>>>>>> implement main0 () = {
>>>>>>>>>>>   val inp = fileref_open_exn("small.csv", file_mode_r)
>>>>>>>>>>>   val ins = streamize_fileref_char(inp)
>>>>>>>>>>>   val ins = stream2list_vt(ins)
>>>>>>>>>>>   val ins = g0ofg1(list_vt2t(ins))
>>>>>>>>>>>   val ( ) = println! ("length(ins) = ", length(ins))
>>>>>>>>>>>   val ( ) = (ins).foreach()(lam c => println!(char2int0(c)))
>>>>>>>>>>>   (*
>>>>>>>>>>>   val lexed = lex_csv(true, ';', ins)
>>>>>>>>>>>   *)
>>>>>>>>>>>   val () = fileref_close(inp)
>>>>>>>>>>>   (*
>>>>>>>>>>>   val h = (lexed.head())
>>>>>>>>>>>   val- CSV_Field(r) = h
>>>>>>>>>>>   val a = r.csvFieldContent
>>>>>>>>>>>   val () = println!(a)
>>>>>>>>>>>   *)
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Mar 2, 2017 at 9:13 PM, August Alm <...> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Just "a;b", or? (Attached.)
>>>>>>>>>>>>
>>>>>>>>>>>> On Friday, 3 March 2017 at 03:03:08 UTC+1, gmhwxi wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I suspect that the file you used contains other characters.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What is in "small.csv"?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Mar 2, 2017 at 8:52 PM, August Alm <...> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The file compiles (I've tried a few compiler options) and "gdb run" yields
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>>>>>>>>>>> 0x00007ffff783eea5 in _int_malloc (av=0x7ffff7b6a620 <main_arena>, bytes=16) at malloc.c:3790
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The frames 0-3 involve allocation functions that are not particular to my file. Frame 4 says:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> #4 __patsfun_28__28__14 (arg0=<optimized out>, env1=0x605540, env0=10 '\n') at csv_lexer_dats.c:9023
>>>>>>>>>>>>>> 9023 ATSINSmove_con1_new(tmpret63__14, postiats_tysum_7) ;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My not-so-educated guess is that this refers to making a cons-cell of a stream.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But: How can my function do just fine when manually fed
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> cons('a', cons( ';', sing('b'))): stream_vt(char),
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> but segfault when I use [streamize_fileref_char] to construct the very same stream from the string "a;b" in a file? Where is the room for an infinite recursion in that?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>> August
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thursday, 2 March 2017 at 23:04:35 UTC+1, August Alm wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm in over my head and tried writing a CSV-parser using linear lazy streams. My code thus far is 600 lines and almost to my own surprise I get it to compile! However, there is something fishy because I get a segfault when applying my program to an actual CSV-file. I've been trying to debug using gdb but the fault eludes me. Since I don't expect anyone to mull through 600 lines of code, I am hoping these code snippets are enough for one of you guys to give me some advice.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This code executes just fine:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> implement main0 () = {
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   val test = stream_vt_make_cons(
>>>>>>>>>>>>>>>     'a', stream_vt_make_cons(
>>>>>>>>>>>>>>>       ';',
>>>>>>>>>>>>>>>       stream_vt_make_sing('b'))) (* the stream ('a', ';', 'b') *)
>>>>>>>>>>>>>>>   val lexed = lex_csv(true, ';', test)
>>>>>>>>>>>>>>>   val h = (lexed.head())
>>>>>>>>>>>>>>>   val- CSV_Field(r) = h
>>>>>>>>>>>>>>>   val a = r.csvFieldContent
>>>>>>>>>>>>>>>   val () = println!(a)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here [lex_csv] is my 600-line algorithm. It reads a [stream_vt(char)] and gives back a [stream_vt(CSVEntry)], where [CSVEntry] is a record type, one of whose fields is [CSVFieldContent]. When executing the program I get "a" printed to the console.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This code results in a segfault:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> implement main0 () = {
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   val inp = fileref_open_exn("small.csv", file_mode_r)
>>>>>>>>>>>>>>>   val ins = streamize_fileref_char(inp)
>>>>>>>>>>>>>>>   val lexed = lex_csv(true, ';', ins)
>>>>>>>>>>>>>>>   val () = fileref_close(inp)
>>>>>>>>>>>>>>>   val h = (lexed.head())
>>>>>>>>>>>>>>>   val- CSV_Field(r) = h
>>>>>>>>>>>>>>>   val a = r.csvFieldContent
>>>>>>>>>>>>>>>   val () = println!(a)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The file "small.csv" only contains the string "a;b". Hence I would expect this code to give the same result as the previous one! But it doesn't just return something else, it segfaults.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> gdb indicates there is a malloc problem having to do with "GC_clear_stack_inner", in case that's helpful. (I'm a mathematician who recently left academia after a postdoc and decided to teach myself programming to become more useful outside of academia; hence I understand type systems and the like--the mathy stuff--a lot better than I understand memory allocation and other stuff that most programmers are supposed to be confident with.)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What could be the problem here?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best wishes,
>>>>>>>>>>>>>>> August
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> You received this message because you are subscribed to the Google Groups "ats-lang-users" group.
>>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to ats-lang-user...@googlegroups.com.
>>>>>>>>>>>>>> To post to this group, send email to ats-lan...@googlegroups.com.
>>>>>>>>>>>>>> Visit this group at https://groups.google.com/group/ats-lang-users.
>>>>>>>>>>>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/ats-lang-users/69535c5c-eac3-472c-bb39-062ad4708a72%40googlegroups.com