I've spent a few days putting together a little application that
generates a database of RFCs.  It uses XML::Parser to generate the
database, and Class::DBI and DBD::SQLite to store and search the data.

I've been calling it RFC::Index.  It possibly falls more under the
category of "application" than library, though the usage I had in mind
for it was as a supplemental tool for a few other projects I have in the
pipeline.

I've attached the POD for the two main modules.  My questions:  Is this
something that should/could go onto CPAN?  And, if so, is the name
appropriate?  I realize that there is no RFC top-level namespace right
now, but that seems to be the most appropriate place for it.

(darren)

-- 
When it is dark enough, you can see the stars.
    -- Ralph Waldo Emerson
NAME
    RFC::Index - Create and maintain a local searchable RFC database.

SYNOPSIS
        use RFC::Index '/usr/local/share/rfc.db';

        my $rfc = RFC::Index->retrieve("RFC2822");
        my @rfc = RFC::Index->search("HTTP", "MIME");

DESCRIPTION
    "RFC::Index" is used to create and maintain a local searchable RFC
    database. It consists of three elements: a parser to parse the
    rfc-index.xml file (in the root of the RFC directory, see
    "ftp://ftp.rfc-editor.org/in-notes/rfc-index.xml"); a search frontend
    ("RFC::Index"); and a set of classes that implement the RFC data
    elements ("RFC::Entry", "RFC::Author", "RFC::Keyword", and so on).

    "RFC::Index" uses "Class::DBI" and "DBD::SQLite" under the hood. It is
    possible that "RFC::Index" will work just fine with RDBMS other than
    SQLite, but I have not tested it with anything other than SQLite.

USAGE
    There are two main search methods available from the "RFC::Index" class:
    "retrieve", to retrieve an individual RFC by number, and "search", for
    searching for RFCs by keyword. "retrieve" returns an "RFC::Entry" object
    or undef, while "search" returns a (possibly empty) array of
    "RFC::Entry" objects.

    Import "RFC::Index" with the path to the database to use:

        use RFC::Index '/usr/local/share/rfc.db';

    or explicitly call "import" with the path to the database:

        use RFC::Index;
        RFC::Index->import('/usr/local/share/rfc.db');

    If you are using a DBD other than SQLite, you can pass any other
    arguments to "import"; they will be passed on blindly to the "set_db"
    call:

        use RFC::Index
            'dbi:mysql:RFC:dbhost', 'rfcuser', 'rfcpass',
            { RaiseError => 0 };
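
    The following is a minimal sketch, not the module's actual code, of how
    an "import" routine might decide what to hand to "set_db". The helper
    name "connection_args" and the rule that a bare path becomes an SQLite
    data source are assumptions made for illustration;
    "dbi:SQLite:dbname=FILE" is the data source form that DBD::SQLite
    accepts.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RFC::Index): turn the arguments given
# to import() into an argument list for a DBI-style connect call.
sub connection_args {
    my @args = @_;
    return unless @args;

    # Anything that already looks like a DBI data source passes through.
    return @args if $args[0] =~ /^dbi:/i;

    # Otherwise assume a path to an SQLite file, with empty user/password.
    my $path = shift @args;
    return ("dbi:SQLite:dbname=$path", '', '', @args);
}

print join(" | ", connection_args('/usr/local/share/rfc.db')), "\n";
print join(" | ", connection_args('dbi:mysql:RFC:dbhost', 'rfcuser', 'rfcpass')), "\n";
```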

    retrieve
        Call "retrieve" with the number of an RFC:

            my $rfc = RFC::Index->retrieve(2822);

        The number can optionally be prefixed with "RFC", which is how they
        are referred to in the RFC indexes:

            my $rfc = RFC::Index->retrieve("RFC2822");

    search
        The RFC index includes a number of keywords, and the "search" method
        provides a way to get RFCs based on these keywords. No stemming of
        search terms is performed, however, at least not at this time.
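
    To make the two lookups concrete, here is a small self-contained sketch.
    It does not use "RFC::Index" itself: "rfc_number" and "search_keywords"
    are hypothetical helpers, the keyword table is made up, and treating
    several search terms as "match any" is an assumption, since the
    behaviour for multiple terms is not spelled out above.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce "RFC2822", "rfc822" or 2822 to a bare number.
# Leading zeros are tolerated; anything else yields undef.
sub rfc_number {
    my ($id) = @_;
    return unless defined $id && $id =~ /^\s*(?:rfc)?\s*0*(\d+)\s*$/i;
    return $1 + 0;
}

# Toy keyword table standing in for the database; the keywords are
# invented for illustration and are not taken from the real index.
my %keywords = (
    2822 => [ 'mail', 'message' ],
    2616 => [ 'HTTP', 'hypertext' ],
);

# Exact, case-insensitive keyword match. There is no stemming, so "mails"
# does not find an entry tagged "mail". Matching ANY term is an assumption.
sub search_keywords {
    my %want = map { ( lc($_) => 1 ) } @_;
    my @hits;
    for my $number (sort { $a <=> $b } keys %keywords) {
        push @hits, $number
            if grep { $want{ lc $_ } } @{ $keywords{$number} };
    }
    return @hits;
}

print rfc_number("RFC2822"), "\n";
print join(", ", search_keywords("HTTP", "MIME")), "\n";
```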

    To generate a new index, or update an existing index, use the "reindex"
    method of the "RFC::Index" class:

        use RFC::Index '/usr/local/share/rfc.db';
        use LWP::Simple;

        mirror "ftp://ftp.rfc-editor.org/in-notes/rfc-index.xml" => "rfc-index.xml";
        RFC::Index->reindex("rfc-index.xml");

    The database specified to "RFC::Index::import" will be populated. This
    process may take a while, depending on the speed of your machine; on my
    lightly loaded 1 GHz PIII (1 GB RAM) it takes about 15 minutes to run.
    Multiple runs do not create duplicate entries in the database; the
    parser is designed to be run on a regular basis, to keep the index up to
    date.
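
    One way repeated runs can avoid duplicates is to key every entry on its
    document ID, so that a second pass overwrites rather than appends. The
    sketch below shows the idea with an in-memory hash standing in for the
    database; "load_entries" is a hypothetical name, and nothing here is
    taken from the actual "reindex" implementation.

```perl
use strict;
use warnings;

my %db;   # in-memory stand-in for the database, keyed by document ID

# Hypothetical loader: add new entries, overwrite changed ones, and report
# what happened. Keying on the document ID is what keeps repeated runs
# from creating duplicates.
sub load_entries {
    my @entries = @_;
    my %count = (added => 0, updated => 0, unchanged => 0);
    for my $entry (@entries) {
        my ($id, $title) = @{$entry}{qw(doc_id title)};
        if (!exists $db{$id}) {
            $count{added}++;
        }
        elsif ($db{$id} ne $title) {
            $count{updated}++;
        }
        else {
            $count{unchanged}++;
            next;
        }
        $db{$id} = $title;
    }
    return \%count;
}

my @index = (
    { doc_id => 'RFC822',  title => 'Standard for the format of ARPA Internet text messages' },
    { doc_id => 'RFC2822', title => 'Internet Message Format' },
);

load_entries(@index);               # first run populates the store
my $second = load_entries(@index);  # second run finds nothing new
print scalar(keys %db), " entries, $second->{added} added on the second run\n";
```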

TODO / BUGS
    Support for BCP and STD types
        Currently Best Current Practice entries and Internet Standards are
        not supported. This is a bug of omission.

    Test on non-SQLite databases
        Theoretically, "RFC::Index" should run just fine on databases other
        than SQLite, but this has not been tested.

SUPPORT
    "RFC::Index" is supported by the author.

VERSION
    This is "RFC::Index", revision $Revision: 1.2 $.

AUTHOR
    darren chamberlain <[EMAIL PROTECTED]>

COPYRIGHT
    (C) 2004 darren chamberlain

    This library is free software; you may distribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO
    Perl, RFC::Entry, Class::DBI, DBD::SQLite, Set::Scalar

NAME
    RFC::Entry - An RFC

SYNOPSIS
        use RFC::Index;

        my $rfc822 = RFC::Index->retrieve("rfc822");
        my @mail_rfcs = RFC::Index->search("mail");

DESCRIPTION
    RFC searches using "RFC::Index" return instances of the "RFC::Entry"
    class. Each instance supports a number of Useful Methods, which provide
    access to data gleaned from the index.

    These Useful Methods include:

    title
        The title of the RFC.

    abstract
        A short abstract of the RFC, if it exists in the index.

    date
        The date of the RFC, as a "Time::Piece" instance.

    current_status, publication_status
        The status of the RFC.

    notes
        Any notes attached to the RFC.

    uri The URI of a version of the RFC. By default, this URI will be rooted
        at "ftp://ftp.rfc-editor.org/in-notes", though a new base URI can be
        passed as an argument to "uri":

            print $rfc->uri("http://localhost/mirrors/rfc");

    page_count
        The number of pages in the document.

    char_count
        The number of characters in the document.

    file_format
        The format of the document.

    authors
        Returns an array (or reference to an array, in scalar context) of
        "RFC::Author" objects. These objects have the following methods:

        name
            The name of the author, in the first initial-last name format
            used in the index.

        title
            The title of the author, as listed in the index.

        organization, org_abbrev
            The organization and its abbreviation, as listed in the index.

        An "RFC::Author" instance stringifies to the value of the "name"
        method:

            my @authors = $rfc->authors;
            my $last_author = pop @authors;
            # Only add " and " when there is more than one author.
            my $authors = @authors
                ? join(" and ", join(", ", @authors), $last_author)
                : $last_author;
            print $rfc->doc_id, " was authored by $authors.\n";

    keywords
        Returns an array (or iterator) of keywords attached to the RFC.

    obsoletes, obsoleted_by
        Returns a list of other RFCs that obsolete or are obsoleted by the
        current RFC. For example, RFC822 obsoletes RFC733, and is obsoleted
        by RFC2822.

    updates, updated_by
        Returns a list of other RFCs that update or are updated by the
        current RFC.

    as_xml
        Call the "as_xml" method to have the entry returned as a string of
        XML. This method reconstructs the original index entry, down to the
        indentation.
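
    Because "date" returns a "Time::Piece" instance, the usual "Time::Piece"
    operations apply to it. The sketch below builds two such objects by hand
    as stand-ins for what "$rfc->date" might return (the day of the month is
    arbitrary), then formats and compares them.

```perl
use strict;
use warnings;
use Time::Piece;

# Stand-ins for the values $rfc->date might return for RFC822 and RFC2822.
my $rfc822_date  = Time::Piece->strptime('1982-08-01', '%Y-%m-%d');
my $rfc2822_date = Time::Piece->strptime('2001-04-01', '%Y-%m-%d');

# Accessors for the individual date parts.
printf "%s %d\n", $rfc2822_date->fullmonth, $rfc2822_date->year;

# Time::Piece objects compare with the usual numeric operators.
print "RFC822 is older\n" if $rfc822_date < $rfc2822_date;

# Subtracting two Time::Piece objects yields a Time::Seconds object.
my $gap = $rfc2822_date - $rfc822_date;
printf "%.1f years apart\n", $gap->years;
```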

TODO
    Elements other than <rfc-entry> are not currently supported. This
    includes the Best Current Practice (*BCP*) entries and Standards.

SUPPORT
    "RFC::Entry" is supported by the author.

VERSION
    This is "RFC::Entry", revision $Revision: 1.3 $.

AUTHOR
    darren chamberlain <[EMAIL PROTECTED]>

COPYRIGHT
    (C) 2004 darren chamberlain

    This library is free software; you may distribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO
    Perl, RFC::Index, URI, Time::Piece
