Will Sargent <will.sarg...@gmail.com> writes:
> Basically, the only way I know of to securely check the goodness of java
> serialization is to check the class name. I have no real idea if that can
> be faked or not (I wouldn't be surprised), but anything involving any kind
> of internal query into the structure of a message seems inherently
> doomed.

I, too, read the Foxglove blog post you reference. [1] I find it
relatively troubling/telling that the term "parsing" isn't used anywhere
and that the message seems to be "don't deserialize untrusted data".
Deserialization is parsing. In the author's own words:

> Most programming languages provide built-in ways for users to output
> application data to disk or stream it over the network. The process of
> converting application data to another format (usually binary) suitable
> for transportation is called serialization. The process of reading data
> back in after it has been serialized is called unserialization.

Clearly, given some input, parsing is exactly what one should be doing
with it, "untrusted" or not. The issue is that Java deserialization maps
onto the set of objects that are of *any* serializable class. The example
code then goes:

    MyObject objectFromDisk = (MyObject)ois.readObject();

As evidenced by the cast, the intended input language is the set of
serializations of objects of class MyObject. Unfortunately, this is
validated only after parsing. As you note, this is a spectacular
counter-example to the Langsec war cry of "full recognition before
processing". What the above should really read is:

    MyObject objectFromDisk = ois.readObject(MyObject);

I'm not sure if that's valid Java, but the point is that the parser
should be restricted to the expected class. Obviously, whitelisting is
still much better than nothing.

I find two things interesting in this issue:

1. It props up a suspicion I've long held of data formats that purport
   to be "self-describing": If you know what you are parsing you don't
   need the data to describe itself. If on the other hand you don't know
   what you're parsing and invent some way for the data (i.e. the
   adversary) to tell you, you are doing nothing else than building a
   language backdoor.

2. It shows us that (MyObject)ois.readObject() is objectively bad API
   design.

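For concreteness, here is a rough sketch of how "restrict the parser to
the expected class" might be approximated in stock Java, by overriding
ObjectInputStream.resolveClass so that the class name is checked while
the stream is being read rather than after the cast. Everything named
here (RestrictedRead, SingleClassInputStream, the two-argument
readObject helper, serialize, and a MyObject with a single int field) is
made up for illustration and is not taken from the blog post or from any
library. It is a whitelist of exactly one class, so it only works as
written when the expected class has primitive or String fields; anything
richer would need its field classes allowed as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class RestrictedRead {

    // Hypothetical stand-in for the MyObject class used in the message.
    public static final class MyObject implements Serializable {
        private static final long serialVersionUID = 1L;
        public final int value;

        public MyObject(int value) {
            this.value = value;
        }
    }

    // An ObjectInputStream that refuses to resolve any class descriptor
    // other than the single class the caller says it expects.
    static final class SingleClassInputStream extends ObjectInputStream {
        private final Class<?> expected;

        SingleClassInputStream(InputStream in, Class<?> expected) throws IOException {
            super(in);
            this.expected = expected;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            // Called when a class descriptor is read from the stream,
            // before any object of that class is instantiated.
            if (!desc.getName().equals(expected.getName())) {
                throw new InvalidClassException(desc.getName(), "class not expected here");
            }
            return super.resolveClass(desc);
        }
    }

    // Assumed helper in the spirit of ois.readObject(MyObject).
    public static <T> T readObject(InputStream in, Class<T> expected)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new SingleClassInputStream(in, expected)) {
            return expected.cast(ois.readObject());
        }
    }

    // Convenience for the demo: serialize any object to a byte array.
    public static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(o);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] good = serialize(new MyObject(7));
        MyObject back = readObject(new ByteArrayInputStream(good), MyObject.class);
        System.out.println("read MyObject with value " + back.value);

        // A stream carrying some other serializable class is rejected
        // during parsing, not at a cast afterwards.
        byte[] other = serialize(Integer.valueOf(1));
        try {
            readObject(new ByteArrayInputStream(other), MyObject.class);
        } catch (InvalidClassException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The call site then reads readObject(in, MyObject.class), which is about
as close as I can get to the wished-for ois.readObject(MyObject). Note
that this still leans on the class name found in the stream, so it is a
narrowing of the accepted language rather than a parser written for
MyObject alone.
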
An obvious alternative would be

    MyObject.readObject(is)

but I suspect many a Java designer would wave it off as naive and
impractical for one reason or another. My intuition in contrast is that
a "naive" formulation not being practical often indicates flaws
elsewhere.

-SMH
_______________________________________________
langsec-discuss mailing list
langsec-discuss@mail.langsec.org
https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss