Troy, thanks, it would be great to take a look at how you used Python's AST to do it. Send over a link when you get a chance.
Josh

On Mon, Feb 8, 2021 at 7:22 PM Troy Zimmerman <tazimmer...@me.com.invalid> wrote:

> I have a library that uses Python’s AST module to parse Python expressions
> and map them to Arrow Dataset expressions. I could extract the AST bits
> into a repo if you’re interested. It’s really simple but could serve as
> inspiration.
>
> It allows us to do things like:
>
> table = path.read_table("valid_from < date <= valid_to and security_id in [...]")
>
> which is pretty handy when you’re in IPython or Jupyter.
>
> > On Feb 8, 2021, at 15:23, Josh Mayer <joshuaama...@gmail.com> wrote:
> >
> > It would be useful to be able to create a filter expression from a string,
> > e.g. "date == '2020-01-01' and value >= 1" instead of
> > (field("date") == '2020-01-01') & (field("value") >= 1).
> >
> > There are some existing libraries that make it pretty easy to do in Python
> > (see here <https://gist.github.com/josham/e5a13a16e9f18d7b9056127ac522cf23>)
> > though an old issue ARROW-3458
> > <https://issues.apache.org/jira/browse/ARROW-3458> suggests using Antlr and
> > C++. If a Python only solution is OK I'd be happy to work on adding the
> > feature. If Antlr/C++ is preferred I can help with the grammar and testing
> > but probably not the best person to do the C++ work.
> >
> > Josh
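
For anyone following the thread, here is a rough sketch of the general idea discussed above: walk the tree produced by Python's ast module and rebuild the expression with the operators that Arrow Dataset expressions overload. This is not Troy's library and not the code in the linked gist. The names parse_filter, _convert and _compare are hypothetical, and the sketch assumes every bare name in the string is a column name. The field factory defaults to pyarrow.dataset.field, which is assumed to be installed.

```python
import ast
import operator

# Comparison nodes mapped onto the Python operators that
# pyarrow.dataset expressions overload.
_COMPARE = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
}


def parse_filter(text, field=None):
    """Turn a string such as "a == 1 and b >= 2" into a filter expression.

    `field` is a factory that builds a column reference from a name.
    Hypothetical helper; defaults to pyarrow.dataset.field.
    """
    if field is None:
        import pyarrow.dataset as ds  # assumption: pyarrow is installed
        field = ds.field
    tree = ast.parse(text, mode="eval")
    return _convert(tree.body, field)


def _convert(node, field):
    if isinstance(node, ast.BoolOp):
        # "a and b and c" -> a & b & c, "a or b" -> a | b
        combine = operator.and_ if isinstance(node.op, ast.And) else operator.or_
        parts = [_convert(value, field) for value in node.values]
        result = parts[0]
        for part in parts[1:]:
            result = combine(result, part)
        return result
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return ~_convert(node.operand, field)
    if isinstance(node, ast.Compare):
        # Chained comparison: "a < b <= c" -> (a < b) & (b <= c)
        operands = [node.left, *node.comparators]
        result = None
        for op, lhs, rhs in zip(node.ops, operands, operands[1:]):
            term = _compare(op, lhs, rhs, field)
            result = term if result is None else result & term
        return result
    if isinstance(node, ast.Name):
        # Assumption: every bare name refers to a column.
        return field(node.id)
    # Anything else must be a plain literal (number, string, list, ...);
    # ast.literal_eval raises ValueError for unsupported syntax.
    return ast.literal_eval(node)


def _compare(op, lhs, rhs, field):
    left = _convert(lhs, field)
    if isinstance(op, ast.In):
        # "col in [1, 2]" -> col.isin([1, 2])
        return left.isin(ast.literal_eval(rhs))
    if type(op) in _COMPARE:
        return _COMPARE[type(op)](left, _convert(rhs, field))
    raise ValueError(f"unsupported comparison: {type(op).__name__}")
```

With pyarrow available, parse_filter("date == '2020-01-01' and value >= 1") should build the equivalent of (field("date") == '2020-01-01') & (field("value") >= 1). Because the string is only parsed and never evaluated, unsupported syntax such as function calls is rejected with a ValueError rather than executed.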