Re: Q: How to parse stream of text (from java.io.Reader) via instaparse with minimal input consumption
Hi,

I finally came up with a simple function that incrementally consumes the input stream, accumulating the head of the input, and calls an instaparse parser each time it encounters a ]. If the parse fails, the function recurs and keeps reading. The function does not even depend on instaparse directly.

You can find the source at https://github.com/henrik42/extended-lisp-reader/blob/master/src/extended_lisp_reader/stream_parser.clj

In https://github.com/henrik42/extended-lisp-reader/blob/master/src/extended_lisp_reader/instaparse_adapter.clj you will find (parser-for), which builds an instaparse parser that parses input text from a stream/Reader.

Henrik

--
You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
Re: Q: How to parse stream of text (from java.io.Reader) via instaparse with minimal input consumption
Hi,

instaparse does not support readers, and at the moment it cannot, mainly because it uses JDK regular expressions, which do not support readers (thanks to Mark Engelberg for supplying this info).

IMHO you cannot greedily consume a reader and only then decide that no match was found, because consuming the reader is a destructive side effect. Reading (i.e. consuming) the reader until I find the terminal character is the way to go. But the embedded language could also contain the terminal char. So the solution I came up with consumes the reader until it finds the terminal char, tries to find a complete parse of this head, and continues reading if it does not find one (this is not elegant, but in practice it works for me). So if/when I find a parseable head, the tail is still in the reader and can be consumed by someone else (e.g. the caller of my function, which I do not control). This scheme is a non-greedy parse.

I'm using ] as the terminal char, so the grammar/embedded language should have balanced [...]. Otherwise the non-greedy parse might return the first parseable head it finds and miss a longer parseable head that is still ahead.

I'll post my solution, hopefully next week, and put it on GitHub. Stay tuned.

Regards, Henrik
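The scheme described above can be sketched roughly as follows. This is an illustrative Java version, not the code from the linked repository: the class name `HeadParser`, the method `parseHead`, and the stand-in parser `balanced` are all hypothetical, and the real parser (e.g. an instaparse grammar) is represented only by a function that returns an empty `Optional` on failure.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Optional;
import java.util.function.Function;

final class HeadParser {

    /**
     * Reads one char at a time. Each time the terminal char is seen, the
     * accumulated head (including that terminal) is offered to tryParse.
     * Returns the first successful result, or empty if the reader runs out.
     * Nothing beyond the accepted head is read from the reader.
     */
    static <T> Optional<T> parseHead(Reader in, char terminal,
            Function<String, Optional<T>> tryParse) throws IOException {
        StringBuilder head = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            head.append((char) c);
            if (c == terminal) {
                Optional<T> result = tryParse.apply(head.toString());
                if (result.isPresent()) {
                    return result; // the tail stays unread in the reader
                }
                // Not a complete parse yet: keep reading to the next terminal.
            }
        }
        return Optional.empty();
    }

    /**
     * Hypothetical stand-in for a real parser: succeeds only when the
     * square brackets in the text are balanced.
     */
    static Optional<String> balanced(String text) {
        int depth = 0;
        for (char ch : text.toCharArray()) {
            if (ch == '[') depth++;
            if (ch == ']') depth--;
            if (depth < 0) return Optional.empty();
        }
        return depth == 0 ? Optional.of(text) : Optional.empty();
    }
}
```

With the input `[a [b] c] rest`, the first attempt on `[a [b]` fails, the second attempt on `[a [b] c]` succeeds, and ` rest` is left in the reader for the caller. Note that every retry re-parses the whole head, so the cost grows with the number of terminal chars inside the embedded text.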
Re: Q: How to parse stream of text (from java.io.Reader) via instaparse with minimal input consumption
If you have any control over the writer end, you could also choose a specific terminal character that does not otherwise occur in your grammar and insert it before the next data set. Then you could be assured of a full parse.
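A minimal sketch of that idea in Java, assuming the writer inserts a sentinel char that can never appear in the parsed text. The class name `SentinelReader`, the method `readRecord`, and the choice of U+001E (the ASCII record separator) in the usage note are illustrative assumptions, not something from this thread.

```java
import java.io.IOException;
import java.io.Reader;

final class SentinelReader {

    /**
     * Reads chars up to the sentinel (or end of stream) and returns them.
     * The sentinel itself is consumed but not returned, so the reader is
     * left positioned at the start of the next data set.
     */
    static String readRecord(Reader in, char sentinel) throws IOException {
        StringBuilder record = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != sentinel) {
            record.append((char) c);
        }
        return record.toString();
    }
}
```

Calling `readRecord(reader, '\u001E')` would return exactly the text to hand to the parser as a String, with no retry loop, because the end of the parseable part is unambiguous.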
Q: How to parse stream of text (from java.io.Reader) via instaparse with minimal input consumption
I didn't know Instaparse supported readers. Is there no way to read from the stream into a string until the terminal character for your grammar is found, and then parse that? I assume you have no control over the source data?
Re: Q: How to parse stream of text (from java.io.Reader) via instaparse with minimal input consumption
How structured/delimited is the other part? Could you write your parser in such a way that it parses what you want and then returns the rest as a single chunk that you could pass to something else?

On Tuesday, January 6, 2015 at 11:50:26 AM UTC-5, henrik42 wrote:

Hi all, I want to parse a stream of text that comes from a java.io.Reader with https://github.com/Engelberg/instaparse. But only the beginning of the stream can be parsed by my grammar. The rest of the text stream must be consumed/parsed with some other grammar. I know of instaparse's :total parse, which can be used to parse as much as possible and which gives you what could not be parsed. But the text comes from a stream, and further processing will consume this (stateful) stream/Reader. I cannot just slurp the Reader, give instaparse the String and then process the String for the not-parsed part. And I cannot unread that much (it's a PushbackReader). So my question is: is there a way to use instaparse with a Reader/Stream in such a way that instaparse will consume exactly those chars/bytes that are covered by the resulting parse tree and leave any following data unread in the Reader? (Since it is a PushbackReader with buffer size 1, I could accept a 1-char lookahead read.) Any idea? Henrik
Q: How to parse stream of text (from java.io.Reader) via instaparse with minimal input consumption
Hi all,

I want to parse a stream of text that comes from a java.io.Reader with https://github.com/Engelberg/instaparse. But only the beginning of the stream can be parsed by my grammar. The rest of the text stream must be consumed/parsed with some other grammar.

I know of instaparse's :total parse, which can be used to parse as much as possible and which gives you what could not be parsed. But the text comes from a stream, and further processing will consume this (stateful) stream/Reader. I cannot just slurp the Reader, give instaparse the String and then process the String for the not-parsed part. And I cannot unread that much (it's a PushbackReader).

So my question is: is there a way to use instaparse with a Reader/Stream in such a way that instaparse will consume exactly those chars/bytes that are covered by the resulting parse tree and leave any following data unread in the Reader? (Since it is a PushbackReader with buffer size 1, I could accept a 1-char lookahead read.)

Any idea?

Henrik
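To make the pushback constraint concrete, here is a small Java sketch of what a java.io.PushbackReader with the default buffer size of 1 allows. The class name `Lookahead` and both helper methods are hypothetical and only illustrate the standard `read`/`unread` behaviour.

```java
import java.io.IOException;
import java.io.PushbackReader;

final class Lookahead {

    /** Returns the next char without consuming it, or -1 at end of stream. */
    static int peek(PushbackReader in) throws IOException {
        int c = in.read();
        if (c != -1) {
            in.unread(c); // a single char always fits a buffer of size 1
        }
        return c;
    }

    /**
     * Tries to push text back onto the reader. Returns false when the
     * pushback buffer has too little room, which PushbackReader reports
     * with an IOException; in that case nothing is pushed back.
     */
    static boolean tryUnread(PushbackReader in, String text) {
        try {
            in.unread(text.toCharArray());
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

A one-char `peek` works, but pushing back two or more chars fails on a reader created with `new PushbackReader(reader)`, which is why slurping the input and un-reading the unparsed remainder is not an option here.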