Casey Marshall wrote:
Marco Trudel wrote:
Casey Marshall wrote:
Marco Trudel wrote:

<snip>

Back to the patch. Comments? Hints?

Your approach seems OK, but a hand-coded parser may still be faster than using regexps, provided it runs only once over the whole manifest and reads from a buffered stream instead of one byte at a time.
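The buffering point can be illustrated with a small standalone sketch (the stream contents and class name here are made up, not the Classpath code):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedRead {
    public static void main(String[] args) throws IOException {
        byte[] data = "Name: foo.txt\n".getBytes("UTF-8");
        // Wrapping the stream means each read() hits an in-memory
        // buffer instead of the underlying source once per byte.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        int count = 0;
        while (in.read() != -1)
            count++;
        System.out.println(count);
    }
}
```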
I don't think implementing a hand-coded parser is the right thing to do just because regexps are slow. That would require a lot of error-prone code, and parsing tasks like this are exactly what regexps are for. So instead of avoiding regexps because they are slow, the regexp engine should be optimized (IMHO)...


Sure. But you said yourself that you still thought the code was too
slow, and I worry a little that your code may be fragile.

I think it's too slow because of the crypto stuff; parsing the manifest once isn't that much overhead. I still have a problem with the claim that my code might be fragile: I see no more fragility than we had before. Maybe you can explain that in a little more detail (see below).


I do think it is a better approach than what we have now, though, so I
think it should go in.

I will try to take a look at the Manifest class next week. Maybe I can spare a couple of hours and do it right. So please do not commit it yet.


I'd also
be wary of byte[]->String->byte[] conversions, because you are wrapping
each entry+hash in a String, then using String.getBytes() (what if the
default character set is not what you expect?). This byte representation
must match what's in the file exactly, because it is used to verify the
entry.
You did a String->byte[] conversion:
    byte[] target = ("Name: " + entry.getName()).getBytes();
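For reference, a hedged sketch of what an explicit charset would look like at this spot (the entry name is invented; manifest files are UTF-8 per the JAR specification, so naming the encoding avoids any dependence on the platform default):

```java
import java.io.UnsupportedEncodingException;

public class NameBytes {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String entryName = "META-INF/example.txt"; // hypothetical entry name
        // getBytes() with no argument uses the platform default
        // encoding; naming the charset makes the bytes predictable:
        byte[] target = ("Name: " + entryName).getBytes("UTF-8");
        System.out.println(target.length);
    }
}
```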

So only the byte[]->String conversion is new and should be checked (assuming you checked your String->byte[] conversion). From the ByteArrayOutputStream documentation:

toString()
    Converts the buffer's contents into a string, translating bytes into
characters according to the platform's default character encoding.

I'm no expert in charsets, but to me it looks like it does the same thing as the String.getBytes() method. Thus I think this is OK.

Or shall we verify the complete conversion part? As I said, I'm not an
expert in default character encodings. So there might be a problem (that
was already there before my patch).
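One way to check the round trip in isolation (a standalone sketch, not the patch itself): ByteArrayOutputStream.toString() without an argument decodes with the platform default, while the one-argument overload names the encoding explicitly, so decode and re-encode agree by construction.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.util.Arrays;

public class RoundTrip {
    public static void main(String[] args) throws UnsupportedEncodingException {
        byte[] raw = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 encoding of 'é'
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(raw, 0, raw.length);
        // Decode and re-encode with the same named charset. With the
        // no-argument toString()/getBytes() the result depends on the
        // platform default, and the round trip can change the bytes.
        byte[] back = out.toString("UTF-8").getBytes("UTF-8");
        System.out.println(Arrays.equals(raw, back));
    }
}
```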


No. The bytes used in signature verification come from the toByteArray method of ByteArrayOutputStream. The code may mistakenly fail to match the Name: line because it uses String.getBytes().

Yes, you create a byte[] by calling byteArrayOutputStream.toByteArray().
But later, you create the "Name: " entry this way:
        byte[] target = ("Name: " + entry.getName()).getBytes();

So you rely on the default byte representation as well. That's the only reason why I dared to do it too. Or am I missing something here?


It may make sense instead to change the Manifest class to retain the
byte-level representation of each entry, along with the hash value,
instead of re-parsing the manifest like this -- having two parsers for
the same thing is always trouble.
That would really be the best solution. But I'm not interested in reimplementing the whole signing part. I just wanted to add support for entries with line breaks and ran into the speed problem.
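For context, the line-break handling in question comes from the JAR specification: manifest lines are limited to 72 bytes, and a wrapped value continues on the next line after a single leading space. A rough, hypothetical sketch of unfolding such continuation lines (not the patch under discussion):

```java
import java.util.ArrayList;
import java.util.List;

public class Unfold {
    // Join continuation lines (single leading space) onto the
    // previous line, dropping the space, per the JAR spec.
    static List<String> unfold(List<String> rawLines) {
        List<String> result = new ArrayList<String>();
        for (String line : rawLines) {
            if (line.startsWith(" ") && !result.isEmpty()) {
                int last = result.size() - 1;
                result.set(last, result.get(last) + line.substring(1));
            } else {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> raw = new ArrayList<String>();
        raw.add("Name: some/very/long/entry/");
        raw.add(" name.txt");
        System.out.println(unfold(raw).get(0));
    }
}
```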


That doesn't require reimplementing much, just changing the Manifest
parsing code to hold on to the byte representation of each entry, so
those bytes could be used later by the verifier, instead of parsing the
manifest again.
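The idea could look roughly like this (class and method names are hypothetical, not Classpath's): the parser records the exact byte span of each entry alongside the parsed attributes, so the verifier can later digest those bytes directly instead of re-parsing the manifest.

```java
import java.util.HashMap;
import java.util.Map;

public class ManifestSketch {
    // entry name -> exact bytes of that entry's section in the manifest
    private final Map<String, byte[]> rawSections = new HashMap<String, byte[]>();

    void record(String name, byte[] sectionBytes) {
        rawSections.put(name, sectionBytes);
    }

    byte[] bytesFor(String name) {
        return rawSections.get(name); // what the verifier would digest
    }

    public static void main(String[] args) throws Exception {
        ManifestSketch m = new ManifestSketch();
        byte[] section = "Name: foo.txt\nSHA1-Digest: abc=\n\n".getBytes("UTF-8");
        m.record("foo.txt", section);
        System.out.println(m.bytesFor("foo.txt") == section);
    }
}
```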

As I said, I will try to look at it next week.


I'm not sure why I didn't try using the existing Manifest class when I
wrote the signed Jar stuff originally.
I don't know either ;-)


Are you being mean? I don't much appreciate that.

No, why would I be? I meant that if you don't know why you implemented it that way, how should I know? I wasn't involved in the implementation...


Marco
