Why YAML. (was Re: [Module::Build] Re: How to indicate a dependency in my module)

2003-11-13 Thread Michael G Schwern
First, I'd like to address people's concern over the format of the META 
file.  Module users and 99% of module authors have nothing to be concerned
about.   Most folks shouldn't even know the thing exists.

Module::Build has been generating and using META.yml since nearly the 
beginning.  MakeMaker has been generating META.yml automatically for its 
authors since last July.  CPAN is now full of META.yml files.  5.8.2 comes
with a META.yml file.  If you're just now noticing META.yml then we've done 
our job.  It could have been written in Esperanto with FoxPro for all it 
matters to the end user.

If we do our job right, most people should never have to directly read or
write a META.yml file.  It's generated automatically by MakeMaker and 
Module::Build, and module tools (PAUSE, search.cpan.org, CPANPLUS, etc...)
use it without your intervention.

The only people who should be concerned are authors of these tools and folks
involved in gathering CPAN statistics.  Even module authors need not know
about it, as it's automatically generated when they run "make dist" or
"Build dist".

I say this because every few months someone notices META.yml and asks "Why 
did we use YAML and not X?" which starts the same debate all over again with 
the same answers.  We need a FAQ.


To address the "I don't want to learn another data format, why can't we
just use Perl?" issue.  With YAML, you don't have to learn the data format.
Let's look at how you'd generate a META file with Data::Dumper.

    use Data::Dumper;
    $Data::Dumper::Terse = 1;
    my $meta = { name => 'Foo::Bar', version => '1.23' };
    open(META, '>META.perl') or die "Can't write META.perl: $!";
    print META Dumper($meta);
    close META;

And the equivalent with YAML.pm.

    use YAML qw(DumpFile);
    my $meta = { name => 'Foo::Bar', version => '1.23' };
    DumpFile('META.yml', $meta);

And now reading back in the Perl version.

    my $meta = do './META.perl';
    print $meta->{version};

And the YAML version.

    use YAML qw(LoadFile);
    my $meta = LoadFile('META.yml');
    print $meta->{version};

YAML's data model is so similar to Perl's (hashes, scalars, lists) that the
data you put in and take out is almost indistinguishable from Data::Dumper.
All that's different is the transient storage medium.  So even if you wanted
to roll your own META.yml file you never have to learn YAML!  The only
prereq is to have YAML.pm installed.
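
To make that concrete, here is a minimal sketch of a round trip through
YAML.pm using a slightly deeper structure.  It assumes YAML.pm is installed;
the "requires" and "authors" entries and their values are made up for
illustration and are not taken from any real distribution.

```perl
use strict;
use warnings;
use YAML qw(Dump Load);

# Hypothetical metadata: a hash holding scalars, a nested hash and a list.
my $meta = {
    name     => 'Foo::Bar',
    version  => '1.23',
    requires => { 'Test::More' => '0.47' },
    authors  => [ 'A. Hacker', 'B. Coder' ],
};

my $yaml = Dump($meta);    # Perl structure -> YAML text
my $copy = Load($yaml);    # YAML text -> Perl structure, nothing is eval'd

print $yaml;
print "version:  $copy->{version}\n";
print "requires: $copy->{requires}{'Test::More'}\n";
print "authors:  @{ $copy->{authors} }\n";
```

What comes back from Load() is an ordinary hash reference, so the code on
either side of the file never has to touch YAML syntax.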

And, finally, because I know someone's going to ask about the YAML.pm
prereq, YAML.pm is not required by MakeMaker.  MakeMaker generates META.yml
by hand and doesn't read it back for anything.  For Module::Build it's an
optional prereq.

I hope that covers the bases.


Now to address Phil's specific concerns.


On Wed, Nov 12, 2003 at 10:25:00AM -0500, [EMAIL PROTECTED] wrote:
> OK, maybe I'm missing a LOT of context here, 'cause I haven't been
> aggressively keeping up with this mailing list, but the security hole
> argument seems a bit odd.
> 
> These META.yml files we're referring to -- these are meta data for
> managing the build process, files that will be distributed along with
> the tarballs we upload to CPAN, right?  

Module::Build might use META.yml a little in its build process, but it's not
a requirement.  MakeMaker can't even read META.yml.  It's not META.yml's 
primary purpose to be used to build a module.  It's just turning out
to be really damned useful. :)

The primary purpose of META.yml is to supply module meta data that we'd
normally have to go crawling around in the code to get.  Like $VERSION
(to get this you have to eval a line of the module) and its prerequisites
(to get that you have to run the Makefile.PL and parse the resulting 
Makefile).  

It also contains intangibles, like what license the code is distributed
under, that you'd normally have to go groping around in the POD to try and
figure out.
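
As a rough sketch of what that buys a tool author, the snippet below pulls
the version, license and prerequisites out of META.yml text without running
anything from the distribution.  It assumes YAML.pm is installed, and the
field values (and the exact set of fields shown) are illustrative only.

```perl
use strict;
use warnings;
use YAML qw(Load);

# Illustrative META.yml content; the values are made up.
my $text = <<'END_META';
---
name: Foo-Bar
version: 1.23
license: perl
requires:
  Test::More: 0.47
END_META

# Parsing yields plain data; no Makefile.PL or module code is executed.
my $meta = Load($text);

print "$meta->{name} $meta->{version} ($meta->{license})\n";
for my $module (sort keys %{ $meta->{requires} }) {
    print "  requires $module $meta->{requires}{$module}\n";
}
```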

File lists in META.yml are currently used as hints to the PAUSE indexer so
it can better determine what to index.  For example, 5.8.2's META.yml
contains a listing "private" (which I think is changing to "noindex") to tell
PAUSE not to index these files/directories in the module list.  In this case
it's because these are dual-life modules that also have CPAN versions.


> So, if I understand this correctly, you're worried about the build
> process eval'ing the contents of a file I sent you.  Hmm.

No, the case where security enters is when someone is grabbing CPAN meta 
information.  For example, PAUSE, CPAN search sites, anyone trying to do 
CPAN statistics, anyone trying to determine the prerequisites of a module, 
etc...  

CPAN statistics gathering currently involves running large amounts of 
untrusted code.  MakeMaker's parse_version() literally pulls a line of code 
out of the module and evals it.  To determine the prerequisites you have to 
run the Makefile.PL which, as Mark-Jason Dominus once demonstrated with
Memoize (I think it was), could contain "rm -rf /".
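
A small sketch of the underlying problem, using only core modules: the
"metadata" file below is hypothetical, and its harmless side effect stands
in for whatever foreign code might really do.  Reading the data back with
do() means executing everything in the file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical metadata file written as Perl.  Besides the data it carries
# one extra statement, a stand-in for something far nastier.
my ($fh, $path) = tempfile(SUFFIX => '.perl', UNLINK => 1);
print {$fh} <<'END_PERL';
$main::side_effect = 'foreign code ran';
+{ name => 'Foo::Bar', version => '1.23' };
END_PERL
close $fh or die "close failed: $!";

our $side_effect = 'nothing ran';
my $meta = do $path;    # loading the data runs the whole file

print "version:     $meta->{version}\n";
print "side effect: $side_effect\n";
```

The data arrives intact, but so does the side effect, which is exactly what
an indexer or statistics gatherer can't afford with untrusted uploads.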

Of course, when you build and install a module you are trusting that it
won't do anything 

Re: [Module::Build] Re: How to indicate a dependency in my module

2003-11-12 Thread Christopher Hicks
On Wed, 12 Nov 2003, Sam Vilain wrote:
> On Tue, 11 Nov 2003 02:29, Michael G Schwern wrote;
> 
> > YAML was chosen because its human readable and writable, its data
>  ^  ^
> So long as you're a FREAK who likes INDENTING and WHITESPACE to
> signify STRUCTURE.
> 
> Is it any surprise that YAML is supported by PYTHON?!
> 
> </topicalButTechnicallyVoidRantIgnoringTheObviousReplyForComicalValue>

Considering the number of ugly languages that have been spawned from or
inspired by Perl, YAML may turn out to be one of the most respectable in
the long run.  At least it's not procedural like the other ugly children -
PHP and Python.

-- 
/chris

X Windows is to memory as Ronald Reagan was to money.
   -- Unix-Haters Handbook



Re: [Module::Build] Re: How to indicate a dependency in my module

2003-11-12 Thread Phil . Moore

OK, maybe I'm missing a LOT of context here, 'cause I haven't been
aggressively keeping up with this mailing list, but the security hole
argument seems a bit odd.

These META.yml files we're referring to -- these are meta data for
managing the build process, files that will be distributed along with
the tarballs we upload to CPAN, right?  

So, if I understand this correctly, you're worried about the build
process eval'ing the contents of a file I sent you.  Hmm.

OK, why is that any more of a concern than eval'ing the perl module I
also distributed?  Isn't that just as big a security hole?  

Or how about the test suites?  If I want to deploy malicious code in
my CPAN upload, I can just drop the evil code into my test suite, and
when you type "make test", I take over your world.

Am I missing something?  Because I also am loath to accept yet another
file format, personally, and I would also prefer to keep my
configuration data written in perl as well.

And if we're talking about the perl build process, what's wrong with a
perl-specific configuration mechanism?

>>>>> "Michael" == Michael G Schwern writes:

Michael> In a nutshell: eval()ing the Perl structure back in is a major security hole.
Michael> Part of the point of META.yml is to avoid having to run any foreign code to
Michael> figure out module meta information.

Michael> To review (maybe this should be in a FAQ somewhere).

Michael> Data::Dumper/Perl code - Insecure (you have to eval it).  Perl specific.
Michael> Storable - Not human readable.  Format changes slightly from version to
Michael>            version.  Perl specific.
Michael> XML      - Overkill.  Ugly.  Requires translation between Perl data
Michael>            model (hashes, lists, scalars) and XML's (trees).
Michael>            Difficult to read and write by humans.

Michael> YAML was chosen because it's human readable and writable, its data 
Michael> structures closely match those of Perl (ie. scalars, hashes and arrays),
Michael> it can be read without being eval'd, executable code cannot be hidden in
Michael> it and, as a bonus, it's not Perl specific.

Michael> YAML's basic formatting is a structure we're already familiar with and tend
Michael> to use when writing ad-hoc data structures (ie. key: value).
Michael> Indentation as structure we're already more than comfortable with (ie. 
Michael> indented source code) so readers of YAML should have no problem. 
Michael> The less obvious features of YAML shouldn't be necessary for most META.yml
Michael> files.

Michael> Because YAML's data model closely matches that of Perl, writers of META.yml 
Michael> simply need to construct a mirroring Perl structure and let YAML dump it
Michael> out.  It's the closest thing to Data::Dumper evaling available.


Michael> -- 
Michael> Michael G Schwern  [EMAIL PROTECTED]  http://www.pobox.com/~schwern/
Michael> I'll tell you what beats voodoo every time, a big ass knife.
Michael> -- Overkill Battlebot driver


Re: [Module::Build] Re: How to indicate a dependency in my module

2003-11-12 Thread Phil . Moore
>>>>> "Chris" == Christopher Hicks [EMAIL PROTECTED] writes:

 >> So, if I understand this correctly, you're worried about the build
 >> process eval'ing the contents of a file I sent you.  Hmm.
 >> 
 >> OK, why is that any more of a concern than eval'ing the perl module I
 >> also distributed?  Isn't that just as big a security hole?  

Chris> There are any number of reasons I may be interested in
Chris> examining your module for extracting documentation and
Chris> dependencies without actually running a single line of your
Chris> code.  The safer we make this sort of thing the more things
Chris> like search.cpan.org and various more recent alternatives we'll
Chris> have.

Ok, that makes sense... Thanks.  I hadn't really thought about using
this data outside of the build process.


Re: [Module::Build] Re: How to indicate a dependency in my module

2003-11-12 Thread Nicholas Clark
On Wed, Nov 12, 2003 at 04:11:45PM +, Sam Vilain wrote:
> On Tue, 11 Nov 2003 02:29, Michael G Schwern wrote;
> 
> > YAML was chosen because its human readable and writable, its data
>  ^  ^
> So long as you're a FREAK who likes INDENTING and WHITESPACE to
> signify STRUCTURE.
> 
> Is it any surprise that YAML is supported by PYTHON?!

no, not at ALL

but GIVEN that XS also treats INDENTING and WHITESPACE significantly,
it is STRANGE that there are no YAML modules for perl written in XS.

[although I believe that Ingy is working on something]

Nicholas Clark


Re: [Module::Build] Re: How to indicate a dependency in my module

2003-11-11 Thread Michael G Schwern
On Mon, Nov 10, 2003 at 05:42:03PM -0800, Terrence Brannon wrote:
> > Thinking more about this, I guess META.yml would need to provide a 
> > little more info to a configure module. Would something like the 
> > following work?
> 
> It's probably too late, but I am not keen on YAML. What is wrong with 
> pure Perl configuration information? 

In a nutshell: eval()ing the Perl structure back in is a major security hole.
Part of the point of META.yml is to avoid having to run any foreign code to
figure out module meta information.

To review (maybe this should be in a FAQ somewhere).

Data::Dumper/Perl code - Insecure (you have to eval it).  Perl specific.
Storable               - Not human readable.  Format changes slightly from
                         version to version.  Perl specific.
XML                    - Overkill.  Ugly.  Requires translation between Perl
                         data model (hashes, lists, scalars) and XML's (trees).
                         Difficult to read and write by humans.

YAML was chosen because it's human readable and writable, its data 
structures closely match those of Perl (i.e. scalars, hashes and arrays),
it can be read without being eval'd, executable code cannot be hidden in
it and, as a bonus, it's not Perl specific.

YAML's basic formatting is a structure we're already familiar with and tend
to use when writing ad-hoc data structures (i.e. key: value).
We're already more than comfortable with indentation as structure (i.e. 
indented source code), so readers of YAML should have no problem. 
The less obvious features of YAML shouldn't be necessary for most META.yml
files.

Because YAML's data model closely matches that of Perl, writers of META.yml 
simply need to construct a mirroring Perl structure and let YAML dump it
out.  It's the closest thing to Data::Dumper evaling available.


-- 
Michael G Schwern  [EMAIL PROTECTED]  http://www.pobox.com/~schwern/
I'll tell you what beats voodoo every time, a big ass knife.
-- Overkill Battlebot driver