Re: Produce identical packages for checksum comparison?

2009-11-15 Thread Chris

b. f. wrote:

Chris wrote:
  

I'm also thinking of building a simple checksum database to track what
actually changes and what my options were when I compiled it.  It would
allow me to better make regression decisions.  I could also be free to
delete packages and know if I recompile it later that it was the exact
same package with the exact same options.  Very simple script to do
that.  Also if say there was an option when compiling ports to produce
files with specific time/dates it would be helpful in pinpointing which
of my port branches a specific file came from.



The elusive reproducible build.  Many people are interested in doing
this, and it's not as easy as it seems.  Even if you edited your
filesystem or archives to change the timestamps of package files, the
  

I think that could be accomplished through the port makefiles.

toolchain used to create the binary files in packages often injects
random seeds, timestamps, file paths, uid/gid information, etc. that
  

I can understand file paths with debug info.  Timestamps?  OK, sure, for
a timestamp file generated during a make that auto-increments version
numbers.  What would change about uid/gid?  I can't imagine why that
might be in the binaries.  As for tar, a simple utility could capture
all the owner and group info into a text file as strings and set the
files to neutral values for uid/gid.  A hack for the fact that packages
are using tar files.  Why would the build tools be injecting random
numbers into binaries?  I'll look into it.
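
A minimal sketch of that tar idea, assuming Python's standard tarfile
module rather than any ports or pkg tool.  The function name
normalized_tar and the manifest layout are made up for illustration.
It writes members in sorted order with a fixed mtime and neutral
uid/gid, and keeps the real owner and group names in a separate text
manifest:

```python
import io
import tarfile


def normalized_tar(members, fixed_mtime=0):
    """Build an uncompressed tar in memory with neutral metadata.

    members: dict mapping archive path -> (file bytes, owner, group).
    Returns (archive bytes, manifest text holding the real owner/group).
    The function name and manifest format are illustrative assumptions.
    """
    buf = io.BytesIO()
    manifest_lines = []
    # USTAR keeps the headers simple (no pax extended records).
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(members):          # fixed member order
            data, owner, group = members[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = fixed_mtime          # predetermined timestamp
            info.uid = info.gid = 0           # neutral numeric ids
            info.uname = info.gname = ""      # no names in the header
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
            manifest_lines.append(f"{name}\t{owner}\t{group}")
    return buf.getvalue(), "\n".join(manifest_lines) + "\n"
```

Two runs over the same file contents then give byte-identical
archives, whatever the build user or build time was.  This only covers
the tar container; anything the toolchain embeds in the files
themselves is untouched.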


creates differences from one build to the next.  You may have to hack
several base system utilities, and then directly compare the checksums
of binaries in archives after unpacking, or use a more intelligent
comparison. See, for instance, one Japanese developer's attempt to do
this in NetBSD in order to create better quality control for a
commercial product:

http://mail-index.netbsd.org/tech-toolchain/2009/02/17/msg000577.html

Your description of your system's problems sounds bad.  I think you
should concentrate on fixing those first.

What can I say?  I multitask.  If I concentrated on one problem at a
time I would never get anything done.  For my system's problem I think
I'm going to have to either abandon jails or maybe try NFS instead of
nullfs.  Otherwise I'll have to learn the kernel code and how to debug
the FreeBSD kernel.

Thanks for the confirmation that I'm not the only one to think about it and
the link.  Enjoy the day.

Chris






___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Re: Produce identical packages for checksum comparison?

2009-11-15 Thread b. f.
On 11/15/09, Chris christopher...@telting.org wrote:
 b. f. wrote:
 Chris wrote:

...

 Even if you edited your
 filesystem or archives to change the timestamps of package files, the

 I think that could be accomplished through the port makefiles.

I think that the exact reproduction of whole archives will be
problematic, unless you have a means of changing the ctime of the
binaries that have been built to a predetermined value.
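
To illustrate why ctime is the awkward one, here is a small sketch
assuming a POSIX system and Python's os module (the function name
mtime_vs_ctime is made up).  os.utime sets atime and mtime to a chosen
value, but the kernel stamps ctime itself when the metadata changes:

```python
import os
import tempfile


def mtime_vs_ctime(when):
    """Create a scratch file, set its mtime to `when`, return (mtime, ctime)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.utime(path, (when, when))  # sets atime and mtime only
        st = os.stat(path)
        return int(st.st_mtime), int(st.st_ctime)
    finally:
        os.remove(path)
```

The returned mtime equals the requested value, while ctime reflects
the time of the utime call itself.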

 toolchain used to create the binary files in packages often injects
 random seeds, timestamps, file paths, uid/gid information, etc. that

 I can understand file paths with debug info.  Timestamps?  Ok sure for a
 timestamp file being generated during a make that auto increments version
 numbers.  What would change about uid/gid?  I can't imagine why that
 might be in the binaries.

ar(1) and some of the other utilities inject this information into
certain binary files.  Try running 'objdump -a'  on, for example,
some static archive like /usr/lib/libc.a.  Of course this information
can be manipulated, but you have to do it.  See the patches in the
link I cited earlier for other examples.
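
For a rough idea of what is stored there, the sketch below reads the
member headers of an ar archive directly, assuming the common layout:
an 8-byte "!<arch>" magic line, then a 60-byte ASCII header per member
holding name, mtime, uid, gid, mode and size.  The function name
ar_members is made up, and long-name schemes (BSD "#1/", GNU "//") are
not decoded; the raw name field is returned as-is:

```python
AR_MAGIC = b"!<arch>\n"
HEADER_LEN = 60


def ar_members(data):
    """Yield (name, mtime, uid, gid) for each member header in ar bytes."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    pos = len(AR_MAGIC)
    while pos + HEADER_LEN <= len(data):
        hdr = data[pos:pos + HEADER_LEN]
        if hdr[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % pos)
        name = hdr[0:16].decode("ascii", "replace").rstrip()
        # Numeric fields are space-padded ASCII decimal; special
        # members may leave them blank, so treat blank as 0.
        mtime = int(hdr[16:28].strip() or b"0")
        uid = int(hdr[28:34].strip() or b"0")
        gid = int(hdr[34:40].strip() or b"0")
        size = int(hdr[48:58].strip() or b"0")
        yield name, mtime, uid, gid
        # Member data is padded to an even length.
        pos += HEADER_LEN + size + (size % 2)
```

Feeding it the bytes of a static library built twice by different
users would show the mtime, uid and gid columns changing even when the
object code is the same.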

...

 Why would the build tools be injecting random numbers into binaries?

Usually to provide some degree of uniqueness.  I'm not saying that it
is always done, just that it _may_ be done.  See, for example, the gcc
sources or the -frandom-seed option description in gcc(1).  And it may
not be just the compiler toolchain -- a port may do it.
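
One way to pin that seed, as a sketch only: pass -frandom-seed a
string derived from the source path, so it stays the same between
builds but differs from file to file.  The helper name gcc_command and
the hashing scheme are assumptions for illustration; the code only
builds the argument list and does not run the compiler:

```python
import hashlib


def gcc_command(source, output, cc="gcc"):
    """Return a compile command whose random seed depends only on `source`."""
    # Hash of the source path: stable across builds, distinct per file.
    seed = hashlib.sha256(source.encode("utf-8")).hexdigest()[:16]
    return [cc, f"-frandom-seed={seed}", "-c", source, "-o", output]
```

This addresses only the compiler's own random symbol names; a port
that injects its own randomness would need separate treatment.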

Occasionally, there are other sources of non-determinism.  For
example, in a recent thesis, a researcher who was trying to use
reproducible builds to defeat a longstanding security threat found
that the tcc compiler produced non-deterministic builds because of a
defect in sign-extending some casts, and a problem with long double
output.  He also cited another researcher's finding that a certain
Java compiler's output depended upon the heap memory addresses used
during compilation.  See:

http://www.dwheeler.com/trusting-trust/dissertation/wheeler-trusting-trust-ddc.pdf

...

If I concentrated on one problem at a time I would never get anything done.

?! :)


b.


Produce identical packages for checksum comparison?

2009-11-14 Thread Chris
I have a somewhat flaky system.  I would like to compile ports to 
packages multiple times and do a file comparison.  Since packages are 
tar files they wouldn't match for sure just because of the different 
time attributes.  There may be other differences.  Anyone know how to 
generate packages with consistent checksums?


Chris



Re: Produce identical packages for checksum comparison?

2009-11-14 Thread Matthias Apitz
On Saturday, November 14, 2009 at 07:51:17AM -0800, Chris wrote:

 I have a somewhat flaky system.  I would like to compile ports to 
 packages multiple times and do a file comparison. ...

Hi Chris,

What is behind the idea to compile and pack a given port twice if there
are no errors during the build?

matthias

-- 
Matthias Apitz
t +49-89-61308 351 - f +49-89-61308 399 - m +49-170-4527211
e g...@unixarea.de - w http://www.unixarea.de/
Vote NO to EU The Lisbon Treaty: http://www.no-means-no.eu


Re: Produce identical packages for checksum comparison?

2009-11-14 Thread Chris

Matthias Apitz wrote:

On Saturday, November 14, 2009 at 07:51:17AM -0800, Chris wrote:

  
I have a somewhat flaky system.  I would like to compile ports to 
packages multiple times and do a file comparison. ...



Hi Chris,

What is behind the idea to compile and pack a given port twice if there
are no errors during the build?

matthias

  
While I don't think there will be differences I won't know until I do 
it.  Call it reassurance.

To me it seems like a good stress test.

Also every time I update my ports tree I don't know what is going to
break.  I have a jail running all the time to recompile my ports as
they are updated.  I maintain between three and five different
ports/packages branches of different checkout dates.


The system is somewhat flaky and crashes sometimes.  I play with a lot
of stuff and am actually using FreeBSD as my desktop.  I am sure that
most of my crashing is due to multiple jails and using nullfs and
unionfs but that isn't relevant to my current post.


I'm also thinking of building a simple checksum database to track what
actually changes and what my options were when I compiled it.  It would
allow me to better make regression decisions.  I could also be free to
delete packages and know if I recompile it later that it was the exact
same package with the exact same options.  Very simple script to do
that.  Also if say there was an option when compiling ports to produce
files with specific time/dates it would be helpful in pinpointing which
of my port branches a specific file came from.
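
A minimal sketch of what such a checksum database could look like,
assuming Python with the standard sqlite3 and hashlib modules.  The
table layout and the names open_db and record_build are made up for
illustration and are not part of the ports tools:

```python
import hashlib
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the checksum database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS builds ("
        " package TEXT, branch TEXT, options TEXT, sha256 TEXT,"
        " PRIMARY KEY (package, branch, options))"
    )
    return db


def record_build(db, package, branch, options, package_bytes):
    """Record a build, or compare it with the one already recorded.

    Returns None for a first build, True if the checksum matches the
    recorded one, False if it differs.
    """
    digest = hashlib.sha256(package_bytes).hexdigest()
    opts = " ".join(sorted(options))  # option order must not matter
    row = db.execute(
        "SELECT sha256 FROM builds"
        " WHERE package = ? AND branch = ? AND options = ?",
        (package, branch, opts),
    ).fetchone()
    if row is None:
        db.execute("INSERT INTO builds VALUES (?, ?, ?, ?)",
                   (package, branch, opts, digest))
        db.commit()
        return None
    return row[0] == digest
```

Keyed on package, branch and option set, it answers "is this rebuild
the same package I had before?", but only if the packages themselves
can be built reproducibly.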

Chris




Re: Produce identical packages for checksum comparison?

2009-11-14 Thread b. f.
Chris wrote:
I'm also thinking of building a simple checksum database to track what
actually changes and what my options were when I compiled it.  It would
allow me to better make regression decisions.  I could also be free to
delete packages and know if I recompile it later that it was the exact
same package with the exact same options.  Very simple script to do
that.  Also if say there was an option when compiling ports to produce
files with specific time/dates it would be helpful in pinpointing which
of my port branches a specific file came from.

The elusive reproducible build.  Many people are interested in doing
this, and it's not as easy as it seems.  Even if you edited your
filesystem or archives to change the timestamps of package files, the
toolchain used to create the binary files in packages often injects
random seeds, timestamps, file paths, uid/gid information, etc. that
creates differences from one build to the next.  You may have to hack
several base system utilities, and then directly compare the checksums
of binaries in archives after unpacking, or use a more intelligent
comparison. See, for instance, one Japanese developer's attempt to do
this in NetBSD in order to create better quality control for a
commercial product:

http://mail-index.netbsd.org/tech-toolchain/2009/02/17/msg000577.html
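
As a rough sketch of the "compare the contents rather than the
archive" approach, assuming Python's standard tarfile module (the
function names are made up for illustration): hash each regular file
inside two package archives and list the paths whose contents differ,
ignoring the tar headers entirely.

```python
import hashlib
import io
import tarfile


def member_digests(archive_bytes):
    """Map each regular file in a tar archive to the SHA-256 of its data."""
    digests = {}
    # "r:*" lets tarfile detect gzip/bzip2 compression by itself.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        for info in tar:
            if info.isfile():
                data = tar.extractfile(info).read()
                digests[info.name] = hashlib.sha256(data).hexdigest()
    return digests


def differing_members(a_bytes, b_bytes):
    """Return sorted paths that are missing from one side or differ."""
    a, b = member_digests(a_bytes), member_digests(b_bytes)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
```

Differences in archive timestamps or ownership no longer show up;
whatever remains in the list comes from the file contents themselves,
which is where the toolchain issues described above would appear.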

Your description of your system's problems sounds bad.  I think you
should concentrate on fixing those first.

b.