On Tue, Jan 05, 2016 at 09:42:23PM -0800, Will Sargent wrote:
> Although as far as I can tell, you should be running the JVM inside of
> Docker, inside of a VM, inside of AppArmor and seccomp (whatever that is),
> with a patched grsecurity kernel.  And CoreOS is involved somehow.
> 
> The temptation to call it the Turducken Security Model is strong.

:-)

The ideas behind concentric perimeter defenses are old:

http://www.exploring-castles.com/medieval_castle_defence.html
Watch "DOCUMENTARY FULL Medieval Castles and Sieges FULL DOCUMENTARY" on YouTube
https://youtu.be/MT-vmKCmukY
http://www.historynet.com/medieval-warfare-how-to-capture-a-castle-with-siegecraft.htm
http://history.howstuffworks.com/historical-figures/castle2.htm

I should point out that in those days, each defended layer would
attrit your attacking forces; nowadays, what you want are
heterogeneous concentric layers that require different skill sets
(vuln classes) to attrit the pool of possible successful attackers.

Some of the concepts are not so old:
http://www.ranum.com/security/computer_security/archives/usenix-login_3.pdf

Great talk:
http://www.sans.org/reading-room/whitepapers/detection/ids-burglar-alarms-how-to-guide-1324

Along similar lines, I wrote:

On Thu, Jun 11, 2015 at 09:14:12AM -0700, Dr. Ulrich Lang wrote:
> I think a (maybe?) similar level of assurance can be achieved much
> cheaper through a software/system architecture where only small
> parts of the code are critical to the assurance, and the impact of
> errors is managed/contained. Micro-kernel architectures etc.,
> isolation etc. are such examples. I know provability is hard for
> such systems, but in practice I think this goes a long way for a
> fraction of the cost.

I have been doing some threat modelling recently and wondering exactly
how to define what a "security boundary" is.

For example, I can think of:

1. MMU-enforced OS-level memory space (process) isolation
   Crossing it requires IPC, scheduler delay, and a context switch;
   just a context switch (L4 jumping IPC); or at least a TLB flush
   (mmap) - a minimal sketch follows this list

2. Hardware isolation (separate physical machines)
   Requires serialization and network delays
   Remote host could be compromised

3. Guest to Host (or G2G) VM isolation (e.g. WPAR)
   Requires a trapped/emulated hardware call
   May be escapable esp. because of buggy hardware emulation

4. Hardware Security Module (HSM)
   Requires some kind of call, serial line, or a secured network comm
   HSM can be fed bad data

5. "Safe" virtual machine isolation
   Software is executed in a VM (e.g. JVM) that provides some reduced
   functionality relative to native code
   The native code that implements the VM can itself be buggy

6. Virtual machine JIT compiled native code
   Same as #5 but with VM instructions inlined into native code
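
Purely as illustration of boundary #1 (my own toy example, nothing
from the thread): pushing an untrusted parse step into a child process
means memory corruption in the parser stays on its side of the MMU,
and the costs listed in item 1 show up concretely as pipe
serialization and scheduler delay.

# Hypothetical sketch: run an untrusted parse step in a separate OS
# process, so memory corruption in the parser cannot touch the
# parent's address space.  The price is exactly what item 1 lists:
# IPC, serialization, and scheduler/context-switch delay.
import multiprocessing as mp

def risky_parse(blob: bytes) -> dict:
    # Stand-in for a parser we do not trust (e.g. a buggy C extension).
    return {"length": len(blob), "first_byte": blob[:1].hex()}

def worker(conn) -> None:
    blob = conn.recv_bytes()
    try:
        conn.send(("ok", risky_parse(blob)))
    except Exception as e:       # a crash here never corrupts the parent
        conn.send(("error", repr(e)))
    finally:
        conn.close()

def parse_isolated(blob: bytes, timeout: float = 5.0):
    parent_end, child_end = mp.Pipe()
    proc = mp.Process(target=worker, args=(child_end,))
    proc.start()
    parent_end.send_bytes(blob)
    result = parent_end.recv() if parent_end.poll(timeout) else ("error", "timeout")
    proc.join(timeout)
    if proc.is_alive():
        proc.terminate()         # reap the worker if it wedged
    return result

if __name__ == "__main__":
    print(parse_isolated(b"\x7fELF not actually trusted input"))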

Somewhere in between 1 and 2 is IBM mainframe LPARs, where the
physical memory is separated between different running OSes by means
of hardware memory isolation (i.e. not every CPU gets access to every
stick of DRAM - each CPU gets access to a segment of RAM, and many
CPUs can share the same segment as when running a multicore OS).

And then there's correctness-as-defense, where provably correct code
has to be isolated from casual software by one of these mechanisms.

Did I miss anything?  Oh, perhaps:
https://en.wikipedia.org/wiki/Sandbox_(computer_security)

Now, if anyone cares to muse as to where these boundaries are best
deployed in the overall system architecture, I am all ears.

Here's why I ask:

       There are two ways of constructing a software design: One way
       is to make it so simple that there are obviously no
       deficiencies, and the other way is to make it so complicated
       that there are no obvious deficiencies. The first method is far
       more difficult. It demands the same skill, devotion, insight,
       and even inspiration as the discovery of the simple physical
       laws which underlie the complex phenomena of nature.
       -- Tony Hoare

Case study:

The US Embassy in Moscow has a plexiglass room (known as "the bubble")
for sensitive conversations.  This room is acoustically sealed and
isolated from the enclosing room.  The idea behind this design is,
"bug sweeping is hard, the cornerstone of TSCM is the physical search,
let's make that as simple as possible".  So with a plexiglass room,
bugsweeping involves looking around, something anyone can do with
zero training, and is done relatively automatically.

Now I don't know what computer security analogy could be that simple,
apart from airgapping, which isn't really that simple (TEMPEST).
However, there have to be places where lines can be drawn, enforced, or
at least monitored, and these lines must naturally fall at certain
junctures that are, to some degree or another, obvious to the trained
eye.  For example, HSMs have a natural boundary at the layer where
data are encrypted.
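
To make that "natural boundary" concrete, here is a toy Python sketch
(entirely hypothetical, and only a coding convention rather than real
hardware): the key never leaves the object, so the auditable surface
is just the sign/verify calls crossing the boundary - though, as item
4 notes, the HSM can still be fed bad data.

# Hypothetical toy illustrating the HSM-style boundary: the key lives
# only inside this object, and callers only ever cross the boundary
# with data to be signed and signatures to be checked.
import hashlib
import hmac
import os

class ToyHsm:
    def __init__(self) -> None:
        self.__key = os.urandom(32)          # never returned to callers

    def sign(self, data: bytes) -> bytes:
        return hmac.new(self.__key, data, hashlib.sha256).digest()

    def verify(self, data: bytes, tag: bytes) -> bool:
        return hmac.compare_digest(self.sign(data), tag)

hsm = ToyHsm()
tag = hsm.sign(b"wire transfer: $100 to Mallory")
print(hsm.verify(b"wire transfer: $100 to Mallory", tag))   # True
print(hsm.verify(b"wire transfer: $999 to Mallory", tag))   # False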

A Twitter engineer advocated an interesting idea recently [reference
needed].  His idea was to always encode user data on the way in, so
that any unsafe use of the data is flagged in the code by an HTML
decode.  His thinking went: this makes auditing code easy, because no
HTML decodes means no XSS.  Now, there are a few problems with this
approach, not the least of which is the multiplicity of output
contexts with different and incompatible encoding rules, but the
generalized idea is kind of awesome: make safe code transparently
safe.  Or at least, make dangerous code transparently dangerous.
This allows us (security engineers and code reviewers) to rule out
(declare benign) potentially large swaths of code.
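
Here is a small Python sketch of the idea as I understand it (my own
hypothetical rendering, not the engineer's actual code): user data is
HTML-escaped the moment it enters the system, and the only path back
to raw text is an explicit, greppable call.

# Hypothetical sketch of "encode on the way in": untrusted input is
# HTML-escaped at the boundary, and the only way back to raw text is
# an explicitly named function that an auditor can grep for.
import html

class EncodedInput:
    """User-supplied text, HTML-escaped at construction time."""
    def __init__(self, raw: str) -> None:
        self._escaped = html.escape(raw, quote=True)

    def __str__(self) -> str:
        # Safe by default: rendering yields the escaped form.
        return self._escaped

    def dangerously_decode(self) -> str:
        # The single greppable sink: "no dangerously_decode means no
        # XSS" (modulo the output-context caveats above).
        return html.unescape(self._escaped)

comment = EncodedInput('<script>alert("xss")</script>')
print(f"<p>{comment}</p>")            # escaped, safe to template
# grep -rn dangerously_decode .       # the audit is a one-liner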

There are certainly precedents for this, including:
1) static type checking, which avoids expensive run-time checks;
2) proof-carrying code, which shifts the burden of proof onto the
   author/compiler;
3) writing in a safe virtual machine instruction set, likewise.

All of these build a small overhead into development in order to avoid
a larger expense down the line.

=== New comments ===

Whereas most programmers compose things to ensure that certain things
happen in certain cases, we appsec engineers need to ensure that
certain things do not happen in any case, so our calculus is roughly
the opposite of the programmer's, and we are in the difficult position
of proving universals (conventionally, "proving a negative").  By
removing abilities from software (e.g. "this source file does not use
eval"), we can reason much more reliably about its capabilities and
tailor our reviews to the relevant sections.  If there are no such
constraints, we must review all of the code.  Unfortunately I do not
see such constraints being added at the language level, but I have not
researched Java sandboxes and the like yet.
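
For what it's worth, a constraint like "this source file does not use
eval" can be checked mechanically even without language-level support;
here is a rough Python sketch (entirely my own, and only an AST-level
check, not a proof) using the standard ast module.

# Rough, illustrative sketch: mechanically verify the constraint
# "this source file does not call eval/exec/compile", so a reviewer
# can exclude it from that portion of the audit.
import ast
import sys

FORBIDDEN = {"eval", "exec", "compile"}

def violations(path: str):
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read(), filename=path)
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in FORBIDDEN):
            yield f"{path}:{node.lineno}: call to {node.func.id}()"

if __name__ == "__main__":
    found = [v for f in sys.argv[1:] for v in violations(f)]
    print("\n".join(found) or "no forbidden calls found")
    sys.exit(1 if found else 0)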

The most pernicious problems are those which do not operate through a
single, identifiable API and so cannot be grepped for - XSS, SQLi,
integer vulns, and the like arise from primitive operations mixing
tainted and untainted data, with many (or implicit) APIs acting as
sinks for the tainted data.
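
As a crude illustration (again hypothetical, and far weaker than real
taint analysis), the sketch below shows why this is a data-flow
problem rather than a grep problem: the "tainted" bit has to survive
ordinary string operations all the way to the sink, which production
strings do not do.

# Crude, hypothetical taint-tracking sketch: taint propagates through
# ordinary concatenation, and the sink refuses tainted data.  Real
# vulns slip through precisely because real strings carry no such bit.
class Tainted(str):
    """A string that remembers it came from an untrusted source."""
    def __add__(self, other):
        return Tainted(str.__add__(self, other))
    def __radd__(self, other):
        return Tainted(str.__add__(str(other), self))

def run_query(sql: str) -> None:
    if isinstance(sql, Tainted):
        raise ValueError("refusing to execute SQL built from tainted data")
    print("executing:", sql)

user = Tainted("alice' OR '1'='1")
try:
    run_query("SELECT * FROM users WHERE name = '" + user + "'")
except ValueError as e:
    print(e)                      # taint survived the concatenation
run_query("SELECT * FROM users WHERE name = ?")   # parameterized: fine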

It is much harder to notice something which is not there, but should
be, than something which is, but should not.
-- 
http://www.subspacefield.org/~travis/ | if spammer then j...@subspacefield.org
"Computer crime, the glamor crime of the 1970s, will become in the
1980s one of the greatest sources of preventable business loss."
John M. Carroll, "Computer Security", first edition cover flap, 1977
_______________________________________________
langsec-discuss mailing list
langsec-discuss@mail.langsec.org
https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss
