Re: [Haskell-cafe] Maintaining lambdabot
On Wed, Feb 20, 2013 at 11:19 AM, Gwern Branwen gwe...@gmail.com wrote:

> On Wed, Feb 20, 2013 at 1:35 PM, Jan Stolarek jan.stola...@p.lodz.pl wrote:
>> Gwern, and what do you think about James' fork of lambdabot? It seems that there was a lot of work put into it and that this is indeed a good starting point to continue development.
> I haven't looked at the diffs; if, as he says, the security around the evaluation has been weakened, that's a deal-breaker until it's fixed. lambdabot can't be insecure, since it will be run in a public IRC channel.

I absolutely agree. There are sandboxing things we can do around the lambdabot instance, but Haskell has a lot of opportunities for statically disallowing questionable things. I would like to start our defense there and add layers around that.

My real reason for reviving this thread: Can I get a status update, please?

Thanks,
Jason

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Lazy object deserialization
Scott Lawrence byt...@gmail.com writes:

> All the object serialization/deserialization libraries I could find (pretty much just binary and cereal) seem to be strict with respect to the actual data being serialized.

Binary became strict between 0.4.4 and 0.5; I think doing so improved the speed of GHC. I had a file format wrapper at the time which depended on it being lazy, and I was stuck at 0.4.4 for a long time, until somebody helped me with a workaround. Performance is a bit worse than it was with 0.4.4, but I think it's better to stay with current code.

The code in question is in the darcs repository at http://malde.org/~ketil/biohaskell/biosff; the relevant patch is named "Update to use [...]".

-k
--
If I haven't seen further, it is by standing in the footprints of giants
[Haskell-cafe] PPDP 2013: Call for Papers
= Call for Papers =

15th International Symposium on Principles and Practice of Declarative Programming (PPDP 2013)
Special Issue of Science of Computer Programming (SCP)
Madrid, Spain, September 16-18, 2013 (co-located with LOPSTR 2013)
http://users.ugent.be/~tschrijv/PPDP2013/

PPDP 2013 is a forum that brings together researchers from the declarative programming communities, including those working in the logic, constraint and functional programming paradigms, but also embracing a variety of other paradigms such as visual programming, executable specification languages, database languages, and knowledge representation languages. The goal is to stimulate research in the use of logical formalisms and methods for specifying, performing, and analysing computations, including mechanisms for mobility, modularity, concurrency, object-orientation, security, verification and static analysis. Papers related to the use of declarative paradigms and tools in industry and education are especially solicited.

Topics of interest include, but are not limited to:

* Functional programming
* Logic programming
* Answer-set programming
* Functional-logic programming
* Declarative visual languages
* Constraint Handling Rules
* Parallel implementation and concurrency
* Monads, type classes and dependent type systems
* Declarative domain-specific languages
* Termination, resource analysis and the verification of declarative programs
* Transformation and partial evaluation of declarative languages
* Language extensions for security and tabulation
* Probabilistic modelling in a declarative language and modelling reactivity
* Memory management and the implementation of declarative systems
* Practical experiences and industrial application

This year the conference will be co-located with the 23rd International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2013) and held in cooperation with ACM SIGPLAN. The conference will be held in Madrid, Spain.

Previous symposia were held at Leuven (Belgium), Odense (Denmark), Hagenberg (Austria), Coimbra (Portugal), Valencia (Spain), Wroclaw (Poland), Venice (Italy), Lisboa (Portugal), Verona (Italy), Uppsala (Sweden), Pittsburgh (USA), Florence (Italy), Montreal (Canada), and Paris (France).

Papers must describe original work, be written and presented in English, and must not substantially overlap with papers that have been published or that are simultaneously submitted to a journal, conference, or workshop with refereed proceedings. Work that already appeared in unpublished or informally published workshop proceedings may be submitted (please contact the PC chair in case of questions). Proceedings will be published by ACM Press.

After the symposium, the authors of a selection of the best papers will be invited to extend their submissions in the light of the feedback solicited at the symposium. The papers are expected to include at least 30% extra material over and above the PPDP version. Then, after another round of reviewing, these revised papers will be published in a special issue of SCP with a target publication date by Elsevier of 2014.

Important Dates

Abstract submission:    May 27, 2013
Paper submission:       May 30, 2013
Notification:           July 4, 2013
Camera-ready:           July 21, 2013
Symposium:              September 16-18, 2013
Invites for SCP:        October 2, 2013
Submission to SCP:      December 11, 2013
Notification from SCP:  February 22, 2014
Camera-ready for SCP:   March 14, 2014

Authors should submit an electronic copy of the paper (written in English) in PDF. Each submission must include on its first page the paper title; authors and their affiliations; abstract; and three to four keywords. The keywords will be used to assist us in selecting appropriate reviewers for the paper. Papers should consist of no more than 12 pages, formatted following the ACM SIG proceedings template (option 1). The 12-page limit includes references but excludes well-marked appendices not intended for publication. Referees are not required to read the appendices, and thus papers should be intelligible without them.

Program Committee

Sergio Antoy              Portland State University, USA
Manuel Carro              IMDEA Software Institute, Spain
Iliano Cervesato          Carnegie Mellon University, Qatar
Agostino Dovier           Universita degli Studi di Udine, Italy
Maria Garcia de la Banda  Monash University, Australia
Ralf Hinze                University
Re: [Haskell-cafe] Maintaining lambdabot
On Mar 14, 2013, at 11:08 PM, Jason Dagit dag...@gmail.com wrote:

> My real reason for reviving this thread: Can I get a status update, please?

Sure. I don't have as much time as I'd like these days for open-source projects, but with Jan's help the code has been cleaned up quite a bit in general, and a lot of old bit-rot has been fixed.

I have not specifically addressed security yet, but it's not as dire a situation as I think it sounded from my earlier remarks. Basically, security is currently a bit DIY. If there are any holes, they are probably quite subtle, because there are very, very few moving parts on lambdabot's side. mueval and Safe Haskell are used to enforce resource limitations and type-safety, respectively, but -fpackage-trust is not (yet) turned on. So all packages installed on the system are available to the interpreted code (if imported using a command such as "@let import Foo.Bar"), as long as Safe Haskell permits.

This is the main potential security hole: the system may have modules installed and marked Trustworthy that the administrator has not explicitly decided whether to trust, and which are not, in fact, as safe as the author asserts. Currently, lambdabot trusts such packages. My intention is to add some commands (available only to lambdabot admins) or maybe a static configuration option for managing a list of packages to explicitly trust, with all others being untrusted. And of course, for a production system an OS-enforced sandbox is a great idea no matter how secure you believe the code to be.

Aside from that caveat, I think that the code could be put on hackage today and I'd have few, if any, reservations about it.

-- James
Re: [Haskell-cafe] Open-source projects for beginning Haskell students?
On 13-03-11 10:52 PM, Michael Orlitzky wrote:

> On 03/11/2013 11:48 AM, Brent Yorgey wrote:
>> So I'd like to do it again this time around, and am looking for particular projects I can suggest to them. Do you have an open-source project with a few well-specified tasks that a relative beginner (see below) could reasonably make a contribution towards in the space of about four weeks? I'm aware that most tasks don't fit that profile, but even complex projects usually have a few simple-ish tasks that haven't yet been done just because no one has gotten around to it yet.
> It's not exciting, but adding doctest suites with examples to existing packages would be a great help.
> * Good return on investment.
> * Not too hard.
> * The project is complete when you stop typing.

In a similar spirit, many existing projects would benefit from a benchmark suite. It's a fairly simple but somewhat tedious process, good for team-work practice:

1. Take a well-defined task, like parsing JSON, for example.
2. Devise a test scenario that includes the task.
3. Make a list of all libraries on Hackage which (claim to) do the task.
4. Write a simple test for each of the libraries.
5. Combine all the tests into a Criterion test suite.
6. Publish the test suite and the benchmark results.
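The Criterion step of such a suite is small enough to sketch here. This is a minimal example under stated assumptions: the two "libraries" being compared are hypothetical stand-ins (two base functions doing the same comma-counting job), not real JSON parsers; real Hackage packages would slot into the same `bench`/`nf` shape.

```haskell
-- A minimal sketch of a Criterion comparison suite.
-- parseA/parseB are made-up stand-ins for two competing libraries.
module Main where

import Criterion.Main
import qualified Data.ByteString.Char8 as B

-- Stand-in for library A: count commas via elemIndices.
parseA :: B.ByteString -> Int
parseA = length . B.elemIndices ','

-- Stand-in for library B: same task via an explicit strict fold.
parseB :: B.ByteString -> Int
parseB = B.foldl' (\n c -> if c == ',' then n + 1 else n) 0

input :: B.ByteString
input = B.pack "[1,2,3,4,5]"

main :: IO ()
main = defaultMain
    [ bgroup "comma-count"
        [ bench "elemIndices" $ nf parseA input
        , bench "foldl'"      $ nf parseB input
        ]
    ]
```

Running the resulting binary prints a timing report per benchmark; publishing that report is step 6.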
[Haskell-cafe] attoparsec and backtracking
I have a couple of problems with attoparsec which I think are related to its always-backtracks nature, but maybe there's some other way to solve the same problems.

The first is that it's hard to get the right error msg out. For instance, I have a parser that tries to parse a number with an optional type suffix. It's an error if the suffix is unrecognized:

    p_num :: A.Parser Score.TypedVal
    p_num = do
        num <- p_untyped_num
        suffix <- A.option "" ((:[]) <$> A.letter_ascii)
        case Score.code_to_type suffix of
            Nothing -> fail $ "p_num expected suffix in [cdsr]: " ++ show suffix
            Just typ -> return $ Score.Typed typ num

However, which error msg shows up depends on the order of the (<|>) alternatives, and in general on the global structure of the entire parser, because I think it just backtracks and then picks the last failing backtrack. Even after carefully rearranging all the parsers it seems impossible to get this particular error to bubble up to the top. The thing is, as soon as I see an unexpected suffix I know I can fail entirely right there, with exactly that error msg, but since there's no way to turn off backtracking I think there's no way to do that.

The second thing is that I want to lex a single token. I thought I could just parse a term, see how far I got, and then use that index to splitAt the input. But attoparsec doesn't keep track of the current index, so I wrote ((,) <$> p_term <*> A.takeByteString), and then I can use the length of the leftover bytestring. But due to backtracking, that changes what p_term will parse. Since takeByteString always succeeds, it will cause p_term to backtrack until it finds some prefix that will match. The result is that instead of failing to parse, "1." will lex as ("1", "."). Since I integrate lexing and parsing (as is common with combinator parsers), and since it seems I can't keep track of byte position with attoparsec, I think I'm out of luck trying to do this the easy way. I think I have to write a separate lexer that tries to have the same behaviour as the parser, but is all separate code.

I know attoparsec was designed for speed, and error reporting and source positions are secondary. Am I simply asking too much of it? I originally used parsec, but parsing speed is my main bottleneck, so I don't want to give ground there. Is there a clever way to get attoparsec to do what I want? Or a ByteString or Text parser out there which is fast, but doesn't backtrack, or keeps track of input position? I've heard some good things about traditional alex+happy... of course it would mean a complete rewrite but might be interesting. Has anyone compared the performance of attoparsec vs. alex+happy?
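On the lex-a-single-token problem above: recent attoparsec versions export a `match` combinator that pairs a parser's result with the exact input it consumed, which sidesteps the `takeByteString` backtracking trick entirely. A sketch, where `p_term` is a toy stand-in (a signed decimal number) rather than the poster's real term parser:

```haskell
-- Sketch: recover the bytes a parser consumed without tracking byte
-- offsets by hand, using attoparsec's `match` combinator.
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Attoparsec.ByteString.Char8
import qualified Data.ByteString.Char8 as B

-- Toy stand-in for the real p_term.
p_term :: Parser Double
p_term = signed double

-- Lex one token: return the consumed prefix and the remaining input.
lexOne :: B.ByteString -> Either String (B.ByteString, B.ByteString)
lexOne s = case feed (parse (match p_term) s) B.empty of
    Done rest (consumed, _) -> Right (consumed, rest)
    _                       -> Left "no parse"

main :: IO ()
main = print (lexOne "3.14 rest")  -- Right ("3.14"," rest")
```

The `feed ... B.empty` finalizes any partial result, since `parse` is incremental.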
[Haskell-cafe] Optimization flag changing result of code execution
I was trying to solve a computational problem from James P. Sethna's book Statistical Mechanics: Entropy, Order Parameters, and Complexity [1]. The problem is on page 19 of the linked PDF and is titled "Six degrees of separation". For it I came up with this code: http://hpaste.org/84114

It runs fine when compiled with -O0 and consistently yields an answer around 10, but with -O1 and -O2 it consistently gives an answer around 25. Can somebody explain what is happening here?

[1] http://pages.physics.cornell.edu/~sethna/StatMech/EntropyOrderParametersComplexity.pdf

Azeem
Re: [Haskell-cafe] Optimization flag changing result of code execution
Hey Azeem, have you tried running the same calculation using rationals? There are some subtleties to writing numerically stable code using floats and doubles, where simple optimizations change the order of operations in ways that *significantly* change the result. In this case it looks like you're averaging the averages, which I *believe* can get pretty nasty in terms of numerical precision. Rationals would be a bit slower, but you could then sort out which number is more correct.

On Fri, Mar 15, 2013 at 4:07 PM, Azeem -ul-Hasan aze...@live.com wrote:

> I was trying to solve a computational problem from James P. Sethna's book Statistical Mechanics: Entropy, Order Parameters, and Complexity [1]. The problem is on page 19 of the linked PDF and is titled "Six degrees of separation". For it I came up with this code: http://hpaste.org/84114
>
> It runs fine when compiled with -O0 and consistently yields an answer around 10, but with -O1 and -O2 it consistently gives an answer around 25. Can somebody explain what is happening here?
>
> [1] http://pages.physics.cornell.edu/~sethna/StatMech/EntropyOrderParametersComplexity.pdf
>
> Azeem
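The Double-vs-Rational comparison suggested above is easy to demonstrate even without the original program. A base-only sketch (Data.Ratio ships with GHC) on the classic 0.1 + 0.2 + 0.3 example, summed left-to-right:

```haskell
-- Sketch of the precision issue: the same sum is inexact in Double but
-- exact in Rational. Re-running a suspect calculation at Rational is a
-- cheap way to find the "true" answer.
module Main where

import Data.Ratio ((%))

sumD :: Double
sumD = foldl (+) 0 [0.1, 0.2, 0.3]   -- left-to-right IEEE rounding

sumR :: Rational
sumR = foldl (+) 0 [1 % 10, 2 % 10, 3 % 10]

main :: IO ()
main = do
    print (sumD == 0.6)     -- False: rounding error has crept in
    print (sumR == 6 % 10)  -- True: exact arithmetic
```

An optimizer that reassociates floating-point sums changes which roundings happen, which is exactly the kind of drift -O1/-O2 can expose.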
[Haskell-cafe] Resource Limits for Haskell
Hey folks,

Have you ever wanted to implement this function in Haskell?

    -- | Forks a thread, but kills it if it has more than 'limit'
    -- bytes resident on the heap.
    forkIOWithSpaceLimit :: IO () -> {- limit -} Int -> IO ThreadId

Well, now you can! I have a proposal and set of patches here:

http://hackage.haskell.org/trac/ghc/wiki/Commentary/ResourceLimits
http://hackage.haskell.org/trac/ghc/ticket/7763

There is a lot of subtlety in this space, largely derived from the complexity of interpreting GHC's current profiling information. Your questions, comments and suggestions are greatly appreciated!

Cheers,
Edward
Re: [Haskell-cafe] Resource Limits for Haskell
On Fri, Mar 15, 2013 at 5:17 PM, Edward Z. Yang ezy...@mit.edu wrote:

> There is a lot of subtlety in this space, largely derived from the complexity of interpreting GHC's current profiling information. Your questions, comments and suggestions are greatly appreciated!

How secure is this? One of the reasons for forking a process and then killing it after a timeout in lambdabot/mueval is that a thread can apparently block the GC from running with a tight enough loop, and the normal in-GHC method of killing threads doesn't work. Can one simultaneously, in a thread, allocate ever more memory and suppress kill signals?

-- gwern
http://www.gwern.net
Re: [Haskell-cafe] Maintaining lambdabot
On Fri, Mar 15, 2013 at 9:19 AM, James Cook mo...@deepbondi.net wrote:

> [snip]

I haven't been following the thread closely. Is there also a github? If so, where? Some of us figured out a bug fix for the quotes plugin and I'll send a pull request if I get a chance.

Thanks for the update,
Jason
Re: [Haskell-cafe] Resource Limits for Haskell
The particular problem you're referring to is fixed if you compile all your libraries with -falways-yield; see http://hackage.haskell.org/trac/ghc/ticket/367

I believe that it is possible to give a guarantee that the kill signal will hit the thread in a timely fashion. The obvious gap in our coverage at the moment is that there may be some primops that infinite-loop, and there are probably other bugs, but I do not believe they are insurmountable.

Edward

Excerpts from Gwern Branwen's message of Fri Mar 15 14:39:50 -0700 2013:

> How secure is this? One of the reasons for forking a process and then killing it after a timeout in lambdabot/mueval is that a thread can apparently block the GC from running with a tight enough loop, and the normal in-GHC method of killing threads doesn't work. Can one simultaneously, in a thread, allocate ever more memory and suppress kill signals?
Re: [Haskell-cafe] Maintaining lambdabot
On Mar 15, 2013, at 2:45 PM, Jason Dagit dag...@gmail.com wrote:

> I haven't been following the thread closely. Is there also a github? If so, where? Some of us figured out a bug fix for the quotes plugin and I'll send a pull request if I get a chance.

Yep, there is [1]. I'm not sure what the specific bug is that you are referring to, but it's possible it doesn't exist anymore - a large part of the quotes plugin has been rewritten (actually outsourced to a fortune-mod clone written in Haskell called misfortune). If it still does, then of course I'd be happy to accept a fix :)

[1] https://github.com/mokus0/lambdabot
Re: [Haskell-cafe] attoparsec and backtracking
On 3/15/13 3:29 PM, Evan Laforge wrote:

> However, which error msg shows up depends on the order of the (<|>) alternatives, and in general the global structure of the entire parser, because I think it just backtracks and then picks the last failing backtrack. Even after carefully rearranging all the parsers it seems impossible to get this particular error to bubble up to the top. The thing is, as soon as I see an unexpected suffix I know I can fail entirely right there, with exactly that error msg, but since there's no way to turn off backtracking I think there's no way to do that.

I had some similar issues recently. The trick is figuring out how to convince attoparsec to commit to a particular alternative. For example, consider the grammar: A (B A)* C; where if the B succeeds then we want to commit to parsing an A (and if it fails then return A's error, not C's). To simplify things, let's drop the leading A since it's not part of the problem. And let's try to parse an invalid string like "BX" (or "BABX"). The key point is that

    bad = (pB *> pure (:) <*> pA <*> bad) <|> (pC *> pure [])

is different than

    good = do
        e <- eitherP pB pC   -- (Left <$> pB) <|> (Right <$> pC)
        case e of
            Left  _ -> (:) <$> pA <*> good
            Right _ -> pure []

In particular, the first one is bad (for our purposes) because, due to hoisting the choice up high, after parsing the B we fail to commit, so when parsing A fails we'll backtrack over the B and try C instead. Assuming C doesn't overlap with B, we'll then report C's error. Whereas the latter is good because, due to pushing the choice down, once we've parsed B (or C) we're committed to that choice; so when A fails, we'll report A's error (or backtrack to the lowest choice that dominates the call to good).

Attoparsec does indeed just report the failure generated by the final parse, so you'll have to refactor things to recognize which sort of token you're looking for (e.g., p_num vs p_identifier or whatever), and then commit to that choice before actually parsing the token. It's not very modular that way, but I think that's the only option right now. It shouldn't be too hard to design a combinator for doing a hard commit (by discarding the backtrack continuation); but that has modularity issues of its own...

Another option, of course, is to drop down to performing lexing on the ByteString itself (e.g., [1]) and then wrap those individual lexers to work as attoparsec Parsers. Even if using attoparsec for the heavy lifting, this is a good approach for maximizing the performance of the lexing step.

[1] http://hackage.haskell.org/package/bytestring-lexing

--
Live well,
~wren
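The bad/good contrast above can be run directly. A sketch with toy token parsers (pA/pB/pC here are made-up single-character parsers standing in for the grammar's A, B, C), assuming attoparsec's Char8 module and the `eitherP` combinator:

```haskell
-- Runnable sketch of the "commit via eitherP" trick for (B A)* C.
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.Applicative ((<|>))
import Data.Attoparsec.ByteString.Char8
import Data.Attoparsec.Combinator (eitherP)

-- Toy tokens: B is 'b', A is a digit, C is 'c'.
pA, pB, pC :: Parser Char
pA = digit
pB = char 'b'
pC = char 'c'

-- Choice hoisted high: a failing pA backtracks over pB and tries pC.
bad :: Parser [Char]
bad = (pB *> pure (:) <*> pA <*> bad) <|> (pC *> pure [])

-- Commit as soon as pB (or pC) succeeds, so pA's failure is reported.
good :: Parser [Char]
good = do
    e <- eitherP pB pC
    case e of
        Left  _ -> (:) <$> pA <*> good
        Right _ -> pure []

main :: IO ()
main = do
    print (parseOnly bad  "b1b2c")  -- Right "12"
    print (parseOnly good "b1b2c")  -- Right "12"
    print (parseOnly bad  "bx")     -- fails; the message comes from pC
    print (parseOnly good "bx")     -- fails; the message comes from pA
```

On valid input the two parsers agree; on "bx" both fail, but with different error messages, which is the whole point of the refactoring.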
Re: [Haskell-cafe] Maintaining lambdabot
On Fri, Mar 15, 2013 at 3:30 PM, James Cook mo...@deepbondi.net wrote:

> Yep, there is [1]. I'm not sure what the specific bug is that you are referring to, but it's possible it doesn't exist anymore - a large part of the quotes plugin has been rewritten (actually outsourced to a fortune-mod clone written in Haskell called misfortune). If it still does, then of course I'd be happy to accept a fix :)
>
> [1] https://github.com/mokus0/lambdabot

Awesome. I believe the bug is still there. The type for the quote db is:

    type Key    = P.ByteString
    type Quotes = M.Map Key [P.ByteString]

which leaves open the possibility that a key exists but there are no quotes. This is problematic for the current version of random. I glanced at your new version and it wasn't clear to me if it's still a problem (I suspect it is).

One bandaid for this is to change the lines below:

https://github.com/mokus0/lambdabot/blob/master/src/Lambdabot/Plugin/Quote.hs#L161
https://github.com/mokus0/lambdabot/blob/master/src/Lambdabot/Plugin/Quote.hs#L166

In both cases "Just qs" could be changed to "Just qs@(_:_)", and then empty lists would fall through to the default case.

The other fix is to prune out degenerate entries (where a key maps to the empty list). I believe that would be fixed in the serialization function:

    moduleSerialize = Just mapListPackedSerial

Changing that to something like:

    moduleSerialize = Just mapListPackedSerialSansEmpties
      where
        mapListPackedSerialSansEmpties = mapListPackedSerial
            { serialize = serialize mapListPackedSerial . Map.filter (not . null) }

Perhaps that should be added to the Serial module as an alternative to mapListPackedSerial. I haven't tested any of the above code (or even tried to compile it).

Jason
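The pruning half of the fix above can be exercised in isolation with the containers package alone. A sketch with made-up quote data (String keys instead of ByteString, for brevity):

```haskell
-- Sketch of pruning degenerate Map entries (keys with no quotes),
-- mirroring the proposed serialize-time Map.filter fix.
module Main where

import qualified Data.Map as Map

type Key    = String
type Quotes = Map.Map Key [String]

-- Drop entries whose quote list is empty before picking at random.
pruneEmpties :: Quotes -> Quotes
pruneEmpties = Map.filter (not . null)

quotes :: Quotes
quotes = Map.fromList [("alice", ["q1", "q2"]), ("bob", [])]

main :: IO ()
main = do
    print (Map.keys (pruneEmpties quotes))         -- ["alice"]
    print (Map.member "bob" (pruneEmpties quotes)) -- False
```

After pruning, any key the random picker lands on is guaranteed a non-empty list.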
Re: [Haskell-cafe] Maintaining lambdabot
On Fri, Mar 15, 2013 at 4:31 PM, Jason Dagit dag...@gmail.com wrote:

> [snip]

I was going to start making these changes and I noticed that it doesn't currently build with GHC 7.4.1 w/Haskell Platform: https://travis-ci.org/dagit/lambdabot/builds/5541375

Do you know if the constraints on

    regex-posix-0.95.1
    regex-compat-0.95.1

need to be what they are? Could we relax them without breaking anything?

Jason
Re: [Haskell-cafe] attoparsec and backtracking
Evan Laforge wrote:

> The first is that it's hard to get the right error msg out. For instance, I have a parser that tries to parse a number with an optional type suffix. It's an error if the suffix is unrecognized:
>
>     p_num :: A.Parser Score.TypedVal
>     p_num = do
>         num <- p_untyped_num
>         suffix <- A.option "" ((:[]) <$> A.letter_ascii)
>         case Score.code_to_type suffix of
>             Nothing -> fail $ "p_num expected suffix in [cdsr]: " ++ show suffix
>             Just typ -> return $ Score.Typed typ num

I think the mistake here is to parse something and then decide if it's valid. It should be the parser which decides whether it's valid. So rather than:

    suffix <- A.option "" ((:[]) <$> A.letter_ascii)

try:

    typ <- A.choice [ {- list of valid suffix parsers -} ]
    return $ Score.Typed typ num

> However, which error msg shows up depends on the order of the (<|>) alternatives, and in general the global structure of the entire parser, because I think it just backtracks and then picks the last failing backtrack.

I'm not sure if what I've offered will help, but it's worth a try.

> Even after carefully rearranging all the parsers it seems impossible to get this particular error to bubble up to the top.

Yes, I've found it impossible to force attoparsec to fail a parse. I think that is intended as a feature.

> The thing is, as soon as I see an unexpected suffix I know I can fail entirely right there, with exactly that error msg, but since there's no way to turn off backtracking I think there's no way to do that.

Yes, that's my impression.

[snip]

> I originally used parsec, but parsing speed is my main bottleneck, so I don't want to give ground there.

Were you using Parsec as a token parser or as a Char parser? Obviously the second is going to be slow in comparison to the first.

> I've heard some good things about traditional alex+happy... of course it would mean a complete rewrite but might be interesting.

I've used Alex with both Parsec and Happy, where speed was a strong secondary goal. Personally I much prefer Parsec; IMO it's easier to debug, extend, and modify than Happy-based parsers. I also know some people prefer Happy.

> Has anyone compared the performance of attoparsec vs. alex+happy?

I haven't, nor have I compared those two with alex+parsec. It would be an interesting experiment.

Erik
--
Erik de Castro Lopo
http://www.mega-nerd.com/
Re: [Haskell-cafe] attoparsec and backtracking
On 16 March 2013 12:54, Erik de Castro Lopo mle...@mega-nerd.com wrote:

> [snip]
>
> Yes, I've found it impossible to force attoparsec to fail a parse. I think that is intended as a feature.

I don't know about a feature, but I tried adding polyparse-style commit semantics to attoparsec and couldn't do so without making it rather noticeably slower.

--
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
http://IvanMiljenovic.wordpress.com
Re: [Haskell-cafe] attoparsec and backtracking
Is it not possible to add an alternative (no pun intended) to <|> that supports the semantics Evan wants? I would agree that what attoparsec does for <|> of Alternative and mplus of MonadPlus is correct, since e.g. the mplus laws say that a failure must be identity and therefore the following alternatives must be considered. I also find it very convenient that attoparsec works this way, and prefer it to what parsec does by default. However, I do not see why attoparsec cannot have a function <||> that, on failure with consumed input, does not evaluate the remaining alternatives.

On 16/03/13 01:54, Erik de Castro Lopo wrote:

> Evan Laforge wrote:
>> However, which error msg shows up depends on the order of the (<|>) alternatives, and in general the global structure of the entire parser, because I think it just backtracks and then picks the last failing backtrack.
> I'm not sure if what I've offered will help, but it's worth a try.