Why YAML. (was Re: [Module::Build] Re: How to indicate a dependency in my module)
First, I'd like to address people's concern over the format of the META file. Module users and 99% of module authors have nothing to be concerned about. Most folks shouldn't even know the thing exists.

Module::Build has been generating and using META.yml since nearly the beginning. MakeMaker has been generating META.yml automatically for its authors since last July. CPAN is now full of META.yml files. 5.8.2 comes with a META.yml file. If you're just now noticing META.yml then we've done our job. It could have been written in Esperanto with FoxPro for all it matters to the end user.

If we do our job right, most people should never have to directly read or write a META.yml file. It's generated automatically by MakeMaker and Module::Build, and module tools (PAUSE, search.cpan.org, CPANPLUS, etc...) use it without your intervention. The only people who should be concerned are authors of these tools and folks involved in gathering CPAN statistics. Even module authors need not know about it, as it's automatically generated when they run "make dist" or "Build dist".

I say this because every few months someone notices META.yml and asks "Why did we use YAML and not X?" which starts the same debate all over again with the same answers. We need a FAQ.

To address the "I don't want to learn another data format, why can't we just use Perl?" issue: with YAML, you don't have to learn the data format. Let's look at how you'd generate a META file with Data::Dumper.

    use Data::Dumper;
    $Data::Dumper::Terse = 1;
    my $meta = { name => 'Foo::Bar', version => 1.23 };
    open(META, ">META.perl") or die "Can't write META.perl: $!";
    print META Dumper($meta);
    close META;

And the equivalent with YAML.pm.

    use YAML qw(DumpFile);
    my $meta = { name => 'Foo::Bar', version => 1.23 };
    DumpFile("META.yml", $meta);

And now reading back in the Perl version.

    my $meta = do './META.perl';
    print $meta->{version};

And the YAML version.

    use YAML qw(LoadFile);
    my $meta = LoadFile("META.yml");
    print $meta->{version};

YAML's data model is so similar to Perl's (hashes, scalars, lists) that the data you put in and take out is almost indistinguishable from Data::Dumper. All that's different is the transient storage medium. So even if you wanted to roll your own META.yml file you never have to learn YAML! The only prereq is to have YAML.pm installed.

And, finally, because I know someone's going to ask about the YAML.pm prereq: YAML.pm is not required by MakeMaker. It generates META.yml by hand and doesn't use it for anything. For Module::Build it's an optional prereq.

I hope that covers the bases. Now to address Phil's specific concerns.

On Wed, Nov 12, 2003 at 10:25:00AM -0500, [EMAIL PROTECTED] wrote:
> OK, maybe I'm missing a LOT of context here, 'cause I haven't been
> aggressively keeping up with this mailing list, but the security hole
> argument seems a bit odd. These META.yml files we're referring to --
> these are meta data for managing the build process, files that will be
> distributed along with the tarballs we upload to CPAN, right?

Module::Build might use META.yml a little in its build process, but it's not a requirement. MakeMaker can't even read META.yml. It's not META.yml's primary purpose to be used to build a module. It's just turning out to be really damned useful. :)

The primary purpose of META.yml is to supply module meta data that we'd normally have to go crawling around in the code to get. Like $VERSION (to get this you have to eval a line of the module) and its prerequisites (to get those you have to run the Makefile.PL and parse the resulting Makefile). It also contains intangibles, like what license the code is distributed under, that you'd normally have to go groping around in the POD to try and figure out.

File lists in META.yml are currently used as hints to the PAUSE indexer so it can better determine what to index. For example, 5.8.2's META.yml contains a "private" listing (which I think is changing to "noindex") to tell PAUSE not to index these files/directories in the module list. In this case it's because these are dual-life modules that also have CPAN versions.

> So, if I understand this correctly, you're worried about the build
> process eval'ing the contents of a file I sent you. Hmm.

No, the case where security enters is when someone is grabbing CPAN meta information. For example, PAUSE, CPAN search sites, anyone trying to do CPAN statistics, anyone trying to determine the prerequisites of a module, etc...

CPAN statistics gathering currently involves running large amounts of untrusted code. MakeMaker's parse_version() literally pulls a line of code out of the module and evals it. To determine the prerequisites you have to run the Makefile.PL which, as Mark-Jason Dominus once demonstrated with Memoize (I think it was), could contain "rm -rf /".

Of course, when you build and install a module you are trusting that it won't do anything
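
To make the parse_version() point concrete, here is a minimal sketch of the difference between eval'ing a $VERSION line and reading a version out of META.yml text. It is a simplified imitation of the approach described above, not MakeMaker's real code. The names version_by_eval, Scratch, @module_lines and $RAN_FOREIGN_CODE are made up for illustration, and the one-line regex at the end is only a stand-in for a proper YAML parser such as YAML.pm's LoadFile.

```perl
use strict;
use warnings;

# Flag that the "foreign" module line will flip as a harmless side effect.
our $RAN_FOREIGN_CODE = 0;

# Hypothetical module source. The second statement on the $VERSION line
# stands in for whatever an untrusted author might put there.
my @module_lines = (
    'package Foo::Bar;',
    'our $VERSION = "1.23"; $main::RAN_FOREIGN_CODE = 1;',
    '1;',
);

# Find the first line that assigns $VERSION and eval it in a scratch
# package. Everything else on that line gets executed too.
sub version_by_eval {
    my @lines = @_;
    for my $line (@lines) {
        next unless $line =~ /\$VERSION\s*=/;
        return eval "package Scratch; no strict; $line; \$VERSION";
    }
    return;
}

my $evaled = version_by_eval(@module_lines);
print "version via eval: $evaled\n";
print "foreign code ran: $RAN_FOREIGN_CODE\n";

# The same fact taken from META.yml text. Nothing is executed: the file is
# only matched as data. (A real tool would parse it with a YAML module.)
my $meta_yml = "---\nname: Foo::Bar\nversion: 1.23\n";
my ($from_meta) = $meta_yml =~ /^version:\s*(\S+)\s*$/m;
print "version via META.yml text: $from_meta\n";
```

Both routes give the same version, but only the first one lets the module author run code on the machine doing the indexing.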
Re: [Module::Build] Re: How to indicate a dependency in my module
On Wed, 12 Nov 2003, Sam Vilain wrote:
> On Tue, 11 Nov 2003 02:29, Michael G Schwern wrote;
> > YAML was chosen because it's human readable and writable, its data
> ^ ^
> So long as you're a FREAK who likes INDENTING and WHITESPACE to signify
> STRUCTURE. Is it any surprise that YAML is supported by PYTHON?!
>
> /topicalButTechnicallyVoidRantIgnoringTheObviousReplyForComicalValue

Considering the number of ugly languages that have been spawned from or inspired by Perl, YAML may turn out to be one of the most respectable in the long run. At least it's not procedural like the other ugly children - PHP and Python.

--
/chris

X Windows is to memory as Ronald Reagan was to money.
-- Unix-Haters Handbook
Re: [Module::Build] Re: How to indicate a dependency in my module
OK, maybe I'm missing a LOT of context here, 'cause I haven't been aggressively keeping up with this mailing list, but the security hole argument seems a bit odd. These META.yml files we're referring to -- these are meta data for managing the build process, files that will be distributed along with the tarballs we upload to CPAN, right?

So, if I understand this correctly, you're worried about the build process eval'ing the contents of a file I sent you. Hmm. OK, why is that any more of a concern than eval'ing the perl module I also distributed? Isn't that just as big a security hole? Or how about the test suites? If I want to deploy malicious code in my CPAN upload, I can just drop the evil code into my test suite, and when you type "make test", I take over your world.

Am I missing something? Because I also am loath to accept yet another file format, personally, and I would also prefer to keep my configuration data written in perl as well. And if we're talking about the perl build process, what's wrong with a perl-specific configuration mechanism?

Michael == Michael G Schwern writes:

Michael> In a nutshell: eval()ing the Perl structure back in is a major
Michael> security hole. Part of the point of META.yml is to avoid having
Michael> to run any foreign code to figure out module meta information.

Michael> To review (maybe this should be in a FAQ somewhere).

Michael> Data::Dumper/Perl code - Insecure (you have to eval it). Perl
Michael> specific.

Michael> Storable - Not human readable. Format changes slightly from
Michael> version to version. Perl specific.

Michael> XML - Overkill. Ugly. Requires translation between Perl data
Michael> model (hashes, lists, scalars) and XML's (trees). Difficult to
Michael> read and write by humans.

Michael> YAML was chosen because it's human readable and writable, its
Michael> data structures closely match those of Perl (i.e. scalars,
Michael> hashes and arrays), it can be read without being eval'd,
Michael> executable code cannot be hidden in it and, as a bonus, it's not
Michael> Perl specific.

Michael> YAML's basic formatting is a structure we're already familiar
Michael> with and tend to use when writing ad-hoc data structures (i.e.
Michael> key: value). Indentation as structure we're already more than
Michael> comfortable with (i.e. indented source code) so readers of YAML
Michael> should have no problem. The less obvious features of YAML
Michael> shouldn't be necessary for most META.yml files.

Michael> Because YAML's data model closely matches that of Perl, writers
Michael> of META.yml simply need to construct a mirroring Perl structure
Michael> and let YAML dump it out. It's the closest thing to Data::Dumper
Michael> evaling available.

Michael> --
Michael> Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
Michael> I'll tell you what beats voodoo every time, a big ass knife.
Michael> -- "Overkill" Battlebot driver
Re: [Module::Build] Re: How to indicate a dependency in my module
Chris == Christopher Hicks [EMAIL PROTECTED] writes:

>> So, if I understand this correctly, you're worried about the build
>> process eval'ing the contents of a file I sent you. Hmm. OK, why is
>> that any more of a concern than eval'ing the perl module I also
>> distributed? Isn't that just as big a security hole?

Chris> There are any number of reasons I may be interested in examining
Chris> your module for extracting documentation and dependencies without
Chris> actually running a single line of your code. The safer we make
Chris> this sort of thing the more things like search.cpan.org and
Chris> various more recent alternatives we'll have.

Ok, that makes sense... Thanks. I hadn't really thought about using this data outside of the build process.
Re: [Module::Build] Re: How to indicate a dependency in my module
On Wed, Nov 12, 2003 at 04:11:45PM +, Sam Vilain wrote:
> On Tue, 11 Nov 2003 02:29, Michael G Schwern wrote;
> > YAML was chosen because it's human readable and writable, its data
> ^ ^
> So long as you're a FREAK who likes INDENTING and WHITESPACE to signify
> STRUCTURE. Is it any surprise that YAML is supported by PYTHON?!

no, not at ALL but GIVEN that XS also treats INDENTING and WHITESPACE significantly, it is STRANGE that there are no YAML modules for perl written in XS.

[although I believe that Ingy is working on something]

Nicholas Clark
Re: [Module::Build] Re: How to indicate a dependency in my module
On Mon, Nov 10, 2003 at 05:42:03PM -0800, Terrence Brannon wrote:
> > Thinking more about this, I guess META.yml would need to provide a
> > little more info to a configure module. Would something like the
> > following work?
>
> It's probably too late, but I am not keen on YAML. What is wrong with
> pure Perl configuration information?

In a nutshell: eval()ing the Perl structure back in is a major security hole. Part of the point of META.yml is to avoid having to run any foreign code to figure out module meta information.

To review (maybe this should be in a FAQ somewhere).

Data::Dumper/Perl code - Insecure (you have to eval it). Perl specific.

Storable - Not human readable. Format changes slightly from version to version. Perl specific.

XML - Overkill. Ugly. Requires translation between Perl data model (hashes, lists, scalars) and XML's (trees). Difficult to read and write by humans.

YAML was chosen because it's human readable and writable, its data structures closely match those of Perl (i.e. scalars, hashes and arrays), it can be read without being eval'd, executable code cannot be hidden in it and, as a bonus, it's not Perl specific.

YAML's basic formatting is a structure we're already familiar with and tend to use when writing ad-hoc data structures (i.e. key: value). Indentation as structure we're already more than comfortable with (i.e. indented source code) so readers of YAML should have no problem. The less obvious features of YAML shouldn't be necessary for most META.yml files.

Because YAML's data model closely matches that of Perl, writers of META.yml simply need to construct a mirroring Perl structure and let YAML dump it out. It's the closest thing to Data::Dumper evaling available.

--
Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
I'll tell you what beats voodoo every time, a big ass knife.
-- "Overkill" Battlebot driver
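
As an illustration of the "Insecure (you have to eval it)" entry above, here is a minimal sketch using only core Perl. It assumes a hypothetical META.perl file shipped by an untrusted author. The file name, the META_SIDE_EFFECT environment variable and the temp directory are invented for the example, and the side effect is deliberately harmless.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Write a hypothetical metadata file in Perl syntax. It looks like a plain
# hash, but one of the values is a do-block that runs arbitrary code.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'META.perl');

open(my $out, '>', $file) or die "Cannot write $file: $!";
print $out <<'END_META';
# The leading + forces an anonymous hash rather than a code block.
+{
    name    => 'Foo::Bar',
    version => do { $ENV{META_SIDE_EFFECT} = 'code ran while loading metadata'; '1.23' },
}
END_META
close($out) or die "Cannot close $file: $!";

# Reading the "data" back means executing the file.
my $meta = do $file;
die "Could not load $file: " . ($@ || $!) unless ref $meta eq 'HASH';

print "version: $meta->{version}\n";
print "side effect: $ENV{META_SIDE_EFFECT}\n";
```

The reader only asked for a version number, yet the embedded code has already run by the time the hash comes back. Loading the same facts from META.yml with a YAML parser hands back the data without executing anything from the file.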