Hi all, I still had a couple of ยข to give so here they are.
On Wed, Apr 22, 2020 at 09:30:23AM -0500, Joshua Watt wrote: > > On 4/22/20 5:11 AM, Richard Purdie wrote: > > On Tue, 2020-04-21 at 21:51 -0500, Joshua Watt wrote: > > > On Sun, Apr 19, 2020 at 12:04 PM Richard Purdie > > > <[email protected]> wrote: > > > Continuing the discussion from the OE TSC meeting, now that I've had > > > a little time to digest the idea: I don't think the filter functions > > > are necessarily a bad idea. I think the real problem is how to > > > prevent them from becoming API maintenance and debugging problems, > > > and as you stated, performance issues. I think there might be a few > > > ways to address this, and most of it comes down to strictly defining > > > the API. > > > Firstly, I don't think we want to allow (at least at this point) > > > completely arbitrary python functions to be called as a filter. > > > Hopefully, we can figure out the few filter manipulations that are > > > required and design the API such that those can be used without > > > having to open it all the way up. > > I think that is a sensible idea. > > > > The good news is we know roughly what the current needs look like, the > > code is in lib/oe/classextend.py: > > > > def extend_name(self, name): > > if name.startswith("kernel-") or name == "virtual/kernel": > > return name > > if name.startswith("rtld"): > > return name > > if name.endswith("-crosssdk"): > > return name > > if name.endswith("-" + self.extname): > > name = name.replace("-" + self.extname, "") > > if name.startswith("virtual/"): > > subs = name.split("/", 1)[1] > > if not subs.startswith(self.extname): > > return "virtual/" + self.extname + "-" + subs > > return name > > if name.startswith("/") or (name.startswith("${") and > > name.endswith("}")): > > return name > > if not name.startswith(self.extname): > > return self.extname + "-" + name > > return name > > > > def map_depends(self, dep): > > if dep.endswith(("-native", "-native-runtime")) or ('nativesdk-' > > in dep) or ('cross-canadian' in dep) or ('-crosssdk-' in dep): > > return dep > > else: > > # Do not extend for that already have multilib prefix > > var = self.d.getVar("MULTILIB_VARIANTS") > > if var: > > var = var.split() > > for v in var: > > if dep.startswith(v): > > return dep > > return self.extend_name(dep) > > > > which apart from one getVar call, is all static string manipulation. > > Could be better, could be worse. Its surprisingly hard to manipulate a > > depends string (we also have to use the explode_deps function as the > > strings can have versions in). > > > > > Secondly, I think we should assume that any > > > filter function is "pure" [1], meaning it only takes in the input > > > variable value string, possibly some other manipulation arguments, > > > and outputs the new variable value string. Notable, it should *not* > > > take in the datastore object, since this is inherently global data > > > that can change at any time. Hopefully, this will allow optimizations > > > and prevent performance issues because bitbake can keep a cache of > > > filter function inputs mapped to outputs. Since the function outputs > > > *only* depend on the inputs, the filter can avoid being called > > > multiple times if the input set has been seen before. > > That definitely looks like a good move and alleviates some of the > > concerns I have about it. > > > > I do wonder how we handle MULTILIB_VARIANTS in the above example. > > Perhaps some kind of cut down key/value pairs or allow parameters? > > > > > I can dream up a few ways this might look. In this example, I'm > > > defining a filter function that simply adds a defined prefix to every > > > value in a space separated variable. > > > > > > First, the filter function itself: > > > > > > def filter_prefix(in, prefix): > > > return ' '.join(prefix + v for v in in.split()) > > > > > > As for how to express this in a recipe, I can think of a few ways. > > > This first one is a variable flag. The base flag is "filter" and the > > > suffix indicates which filter is applied (in this case, the "prefix" > > > function). The value are the extra arguments passed to the function. > > > > > > DEPENDS[filter:prefix] = "${MLPREFIX}" > > > > > > Another option would be to encode the function and arguments in the > > > flag value, which might also allow filters to be chained (although, I > > > would hope that's a rare requirement!): > > > > > > DEPENDS[filter] = "prefix ${MLPREFIX}" > > > > > > Chaining might look something like: > > > > > > DEPENDS[filter] = "prefix ${MLPREFIX}; suffix ${MY_SUFFIX}" > > > > > > The great part about pure functions here is that the property is > > > transitive, so the entire result of chain with both operations (e.g. > > > suffix(prefix(${DEPENDS}))) can be cached if desired. > > Those are good suggestions. Thinking out loud, I'm kind of drawn to > > something which looks a bit more function like: > > > > DEPENDS[filter] = "prefix_filter('${MLPREFIX}') > > suffix_filter('${MY_SUFFIX}')" > > I was trying to avoid anything too complex to parse.... but I wonder if we > can enlist the help of the python interpreter. If we have an appropriate > delimiter, something like this might work: > > def apply_filter(value, filter_expression): > class Val: > def __init__(self, value): > self.value = value > def prefix(self, prefix): > self.value = ' '.join(prefix + v for v in self.__value.split()) > def suffix(self, suffix): > self.value = ' '.join(v + suffix for v in self.__value.split()) > > v = Val(value) > for e in filter_expression.split(';'): > e = e.strip() > if not e: > continue > eval(e, globals={}, locals={'v': v}) > return v.value > > Obviously, you might want something a little more dynamic for registering the > actual filter functions into the class, but the important part is that the > eval() is pretty restrictive in what filter functions are allowed to access > or do. > > This would allow the user to write a fairly natural expression like: > > DEPENDS[filter] = "v.prefix('${MLPREFIX}'); v.suffix('${MY_SUFFIX}') > > but, things like this also still work (since the code ignores empty > expressions): > > DEPEND[filter] += "; v.prefix('foo')" > > The hardest part is finding a delimiter that isn't likely to show up in the > actual variable values, since the string has to be split and duplicates > ignored before it is parsed by the python code. > I summoned the Google god and I found this stackoverflow on how to execute multiple lines instead of the one eval is expecting: https://stackoverflow.com/a/39381428 https://docs.python.org/3/library/ast.html If I didn't completely misunderstood what was suggested, that should eliminate the need for splitting. Though... I don't know how all of this works wrt basic security against code injection (e.g.: DEPENDS[filter] += "import shutil; shutil.rmtree('/')")? I barely know Python :) Just to try to grasp what's suggested, are all filter flags going to be resolved/executed at one specific point in time or will the filter of each variable be called explicitely when needed? If the former, my simple question is: aren't we just adding one more ordering issue somehow? I'm pretty sure someone will want one day to have this filter flag expanded/resolved at a different time that we would set it to. For the latter, seems error prone but at least we have the ability to fix a missing or misplaced filter rather easily. Hopefully I haven't completely misunderstood the suggestion. I'm afraid that's as far as I can participate to the issue from my eagle view on this as I'm lacking knowledge in Python and bitbake :/ Let me know if there's anything I can do, Thanks, Quentin
-=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#1066): https://lists.openembedded.org/g/openembedded-architecture/message/1066 Mute This Topic: https://lists.openembedded.org/mt/73131657/21656 Group Owner: [email protected] Unsubscribe: https://lists.openembedded.org/g/openembedded-architecture/unsub [[email protected]] -=-=-=-=-=-=-=-=-=-=-=-
