Maybe something like the attached patch?
Well, that's actually the original patch (as opposed to V2) with relaxed test conditions. You can write that a bit nicer by setting the new PC directly after retrieving LR and returning early if that fails. See "[PATCH 2/3] Add frame pointer unwinding as fallback on arm" from February 16th. That's the original algorithm; for aarch64 I just added a few defines and included arm_unwind.c.
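To make the "set PC right after retrieving LR, return early on failure" shape concrete, here is a rough, standalone sketch of one frame-pointer unwind step. It is not the code from the patch: struct regs, struct fake_stack, read_word and fp_unwind_step are hypothetical names, the stack is simulated in an array, and the only layout assumed is the AAPCS64 frame record (previous FP at [FP], saved LR at [FP + 8]). The "relaxed" sanity check shown is just one plausible choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register state; not the data structures used in the patch. */
struct regs { uint64_t pc, fp, sp; };

/* Hypothetical simulated stack: 'count' 64-bit slots starting at 'base'. */
struct fake_stack { uint64_t base; const uint64_t *slots; size_t count; };

/* Read one aligned 64-bit word; fails for addresses outside the fake stack. */
static bool read_word(const struct fake_stack *st, uint64_t addr, uint64_t *out)
{
    assert(st != NULL && out != NULL);
    if (addr < st->base || (addr - st->base) % 8 != 0)
        return false;
    size_t idx = (size_t)((addr - st->base) / 8);
    if (idx >= st->count)
        return false;
    *out = st->slots[idx];
    return true;
}

/* One unwind step via the frame record at FP: {previous FP, saved LR}.
   Works on a copy so *r is left untouched when the step fails. */
static bool fp_unwind_step(const struct fake_stack *st, struct regs *r)
{
    struct regs next = *r;
    uint64_t lr, prev_fp;

    /* Retrieve LR and set the new PC right away; bail out early on failure. */
    if (!read_word(st, r->fp + 8, &lr) || lr == 0)
        return false;
    next.pc = lr;

    if (!read_word(st, r->fp, &prev_fp))
        return false;
    /* Relaxed check (an assumption): the caller's frame must sit at a higher
       address; a previous FP of 0 is taken to mark the outermost frame. */
    if (prev_fp != 0 && prev_fp <= r->fp)
        return false;

    next.sp = r->fp + 16;   /* assumed CFA: just past the frame record */
    next.fp = prev_fp;
    *r = next;
    return true;
}
```

Calling fp_unwind_step in a loop until it returns false walks the chain of frame records; since the step works on a copy, a failed step leaves the last good register state in place for the caller.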
It's in fact a bit annoying for my use case, as the non-CFI stack sections here are mostly in between CFI-enabled stack sections. However, I can accept this.
cheers, Ulf