Wow, this is excellent to hear about! It looks like we have something fun to dig into this weekend!
Thanks,
Justin

On Fri, Aug 8, 2025 at 5:03 AM Jens Dietrich via rb-general <[email protected]> wrote:

> Introducing DALEQ: An Open-Source Tool for Assessing Java Binary
> Equivalence
>
> We’re excited to announce the release of DALEQ, a new open-source tool
> for analyzing and comparing Java binaries. DALEQ is designed to help
> developers, security researchers, and build engineers assess whether two
> .jar files built from the same source code are semantically equivalent,
> even when they’re not bitwise identical. This is particularly useful for
> comparing jars from Maven Central and jars produced via reproducible
> builds, or generated by services like Oracle’s build-from-source or
> Google’s Assured OSS. Although tools like diff or hash-based checks can
> detect binary differences, they don’t explain why binaries differ, or
> whether those differences matter. Bytecode-level differences can be caused
> by changes in compilers or build pipelines, not necessarily by compromised
> builds. DALEQ helps distinguish harmless variation from meaningful
> divergence.
>
> How DALEQ Works
>
> DALEQ focuses on Java bytecode comparison, though it can also analyze
> resources and metadata in jars. At its core, DALEQ uses a datalog engine
> (Soufflé), the same kind of logic-based analysis engine used in systems
> like CodeQL, to normalize and compare bytecode structures. Key features
> include:
>
> - Bytecode normalization to reduce irrelevant build differences
> - Semantic diffing that identifies and explains non-equivalent
>   instructions
> - Provenance tracking: for equivalent files, DALEQ shows how equivalence
>   was derived via datalog rules; for non-equivalent files, it provides
>   bytecode-level diffs
>
> DALEQ also verifies whether the underlying source code inputs are the same
> (or at least equivalent, tolerating some variations in comments and
> formatting) and includes integrations with existing tools like the standard
> javap disassembler. It supports extensibility through a plugin system.
>
> Real-World Evaluation
>
> DALEQ builds on our earlier research into levels of binary equivalence. We
> evaluated the tool using real-world .jar files from Oracle and Google, both
> of whom independently rebuild Java packages from source. The results are
> encouraging: DALEQ was able to classify 85–90% of .class files that were
> not bitwise identical as still being semantically equivalent, with
> supporting provenance.
>
> Learn More
>
> You can try out DALEQ now on GitHub: https://github.com/binaryeq/daleq/
> A detailed technical paper describing DALEQ and our evaluation:
> https://arxiv.org/abs/2508.01530
> A technical paper describing the conceptual approach of levels of binary
> equivalence: https://arxiv.org/abs/2410.08427 (to be presented at ICSME’25
> <https://conf.researchr.org/home/icsme-2025>)
>
> Jens Dietrich (Associate Professor at Victoria University of Wellington)
>
> Behnaz Hassanshahi (Principal Researcher and Tech Lead at Oracle, Oracle
> Labs Brisbane)
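
For anyone who wants a starting point before trying DALEQ itself: the hash-based baseline the announcement contrasts against can be sketched in a few lines of Python. This is not DALEQ and does no bytecode normalization; it only lists which .class entries in two jars differ bitwise, which is the set a semantic tool would then have to explain. The function names class_digests and compare_jars are made up for this illustration, and the jar paths in the usage note are placeholders.

```python
import hashlib
import zipfile


def class_digests(jar_path):
    """Map each .class entry name in a jar to the SHA-256 of its bytes."""
    # A jar is a zip archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(jar_path) as jar:
        return {
            name: hashlib.sha256(jar.read(name)).hexdigest()
            for name in jar.namelist()
            if name.endswith(".class")
        }


def compare_jars(jar_a, jar_b):
    """Group .class entries by whether they match bitwise across two jars."""
    a = class_digests(jar_a)
    b = class_digests(jar_b)
    shared = a.keys() & b.keys()
    return {
        "identical": sorted(n for n in shared if a[n] == b[n]),
        # Bitwise-different entries: a hash check cannot say whether these
        # differences are harmless or meaningful.
        "different": sorted(n for n in shared if a[n] != b[n]),
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
    }
```

Calling compare_jars("from-central.jar", "rebuilt.jar") returns a dict of four sorted lists; the "different" list is where a semantic comparison would pick up.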
