Re: [Factor-talk] Peg Parsing Files

2020-11-23 Thread Chris Double
On Sun, Nov 22, 2020 at 6:15 AM Alexander Ilin wrote: > So, the question is, is it possible to use the peg vocab to create this > layered parsing architecture, where the first layer would consume a stream of > file input, and the second would consume a stream of tokens from the first > layer?

Re: [Factor-talk] Peg Parsing Files

2020-11-22 Thread John Benediktsson
Right now this is true: binary stream-seekable? But none of the decoders allow stream-seeking. Maybe we should support that for ascii and other 8-bit encodings... On Sun, Nov 22, 2020 at 3:17 PM Jon Harper wrote: > I didn't know about the seekable streams implementation in factor: > https://

Re: [Factor-talk] Peg Parsing Files

2020-11-22 Thread Jon Harper
I didn't know about the seekable streams implementation in factor: https://docs.factorcode.org/content/article-stream-protocol.html We have: "/tmp/toto" ascii stream-seekable? f "/tmp/toto" (file-reader) t ! it works, stream-seek can rewind or fast forward and stream-read1 will get the bytes

Re: [Factor-talk] Peg Parsing Files

2020-11-21 Thread Alexander Ilin
What's the problem backtracking through a file? Streams can have arbitrary positioning. Alternatively, there must be a simple way to "wrap" a random access file stream with an array-like interface, right? Don't we have a class like that somewhere? Like the reverse of and ? I will definitely use th

Re: [Factor-talk] Peg Parsing Files

2020-11-21 Thread Jon Harper
What you are describing reminds me of the built-in nesting of parsers with the ebnf tokenizer: https://docs.factorcode.org/content/article-peg.ebnf.tokenizers.html Not sure if it's really applicable to your use case though. As for streams, aren't packrat parsers backtracking (with memoization)?

Re: [Factor-talk] Peg Parsing Files

2020-11-21 Thread Alexander Ilin
I have another question to follow this up. I want to create a multilayer parser. The first layer would read a bunch of textual log files (with date-time in their names) and search for various types of events using PEG patterns per line and skipping noise. This would produce a stream of interesting e