cj-zhukov commented on PR #20122: URL: https://github.com/apache/datafusion/pull/20122#issuecomment-3839430300
### High-Level Overview

This PR is an exploratory step to evaluate whether using a parser combinator library (`nom`) improves the clarity and robustness of the example documentation parsing logic.

In a previous PR (https://github.com/apache/datafusion/pull/19750), the parsing of subcommands and example metadata in `main.rs` docs was implemented using ad-hoc string manipulation. While that approach works, this PR experiments with replacing that logic using `nom` for two functions:

- `parse_subcommand_line`
- `parse_metadata_line`

Personally, I found the `nom`-based implementation easier to read, reason about, and maintain. Expressing the grammar declaratively with a parser tool feels more natural for this kind of structured input, and the intent of the parsing logic is clearer compared to manual string slicing and conditionals.

That said, this PR is intentionally limited in scope. `nom` is currently used only for these two parsing helpers, and introducing a new dependency for such a narrow use case may not be justified on its own. The main open question is whether DataFusion would benefit from using `nom` more broadly for similar parsing tasks in the future.

If the project sees value in adopting `nom` for other parsing needs, this PR could serve as a small, contained starting point. Otherwise, it may be reasonable to stick with the existing ad-hoc approach to avoid dependency overhead.

Feedback on whether this trade-off is worthwhile is very welcome.
