Hello,
This is a pre-announcement of Apache::ConfigParser. I wrote it to
allow programs separate from Apache to completely understand,
parse and manipulate Apache configuration files.
The interface is not simple, but it allows for more complicated
queries of the configuration, such as finding the ServerName
associated with a log file.
There are two separate modules described here. The first manages
a single directive and the second assembles these into an object
that represents a complete configuration file.
Comments are welcome, including on the name of the module.
It's available now at
http://www.orcaware.com/perl/Apache-ConfigParser-0.01.tar.gz
and will be up on CPAN if there are no serious comments.
Regards,
Blair
NAME
Apache::ConfigParser::Directive - An Apache directive or start
context
SYNOPSIS
use Apache::ConfigParser::Directive;
# Create a new empty directive.
my $d = Apache::ConfigParser::Directive->new;
# Make it a ServerRoot directive.
# ServerRoot /etc/httpd
$d->name('ServerRoot');
$d->value('/etc/httpd');
# A more complicated directive. Value automatically splits the
# argument into separate elements. It treats elements in "'s as a
# single element.
# LogFormat "%h %l %u %t \"%r\" %>s %b" common
$d->name('LogFormat');
$d->value('"%h %l %u %t \"%r\" %>s %b" common');
# Get a string form of the name.
# Prints `logformat'.
print $d->name, "\n";
# Get a string form of the value.
# Prints `"%h %l %u %t \"%r\" %>s %b" common'.
print $d->value, "\n";
# Get the values separated into individual elements. Whitespace
# separated elements that are enclosed in "'s are treated as a
# single element. Protected quotes, \", are honored to not begin or
# end a value element. In this form protected "'s, \", are no
# longer protected.
my @value = $d->get_value_array;
scalar @value == 2; # There are two elements in this array.
$value[0] eq '%h %l %u %t "%r" %>s %b';
$value[1] eq 'common';
# The array form can also be set. Change style of LogFormat from a
# common to a referer style log.
$d->set_value_array('%{Referer}i -> %U', 'referer');
# This is equivalent.
$d->value('"%{Referer}i -> %U" referer');
# There is also an equivalent pair of values that are called
# `original' that can be accessed via orig_value,
# get_orig_value_array and set_orig_value_array.
$d->orig_value('"%{User-agent}i" agent');
$d->set_orig_value_array('%{User-agent}i', 'agent');
@value = $d->get_orig_value_array;
scalar @value == 2; # There are two elements in this array.
$value[0] eq '%{User-agent}i';
$value[1] eq 'agent';
# You can set undef values for the strings.
$d->value(undef);
DESCRIPTION
The "Apache::ConfigParser::Directive" module is a subclass
of "Tree::DAG_Node", which provides methods to represent
nodes in a tree. Each node is a single Apache configuration
directive or root node for a context, such as <Directory>
or <VirtualHost>. All of the methods in that module
are available here. This module adds some additional
methods that make it easier to represent Apache directives
and contexts.
This module holds a directive or context:
name
value in string form
value in array form
a separate value termed `original' in string form
a separate value termed `original' in array form
the filename where the directive was set
the line number in the filename where the directive was set
The `original' value is separate from the non-`original'
value and the methods to operate on the two sets of values
have distinct names. The `original' value can be used to
store the original value of a directive while the
non-`original' value can be a modified form, such as
changing the CustomLog filename to make it absolute. The
actual use of these two distinct values is up to the
caller as this module does not link the two in any way.
METHODS
The following methods are available:
$d = Apache::ConfigParser::Directive->new;
This creates a brand new "Apache::ConfigParser::Directive"
object.
It is not recommended to pass any arguments to "new"
to set the internal state; use the following methods
instead.
There actually is no "new" method in the
"Apache::ConfigParser::Directive" module. Instead, due to
"Apache::ConfigParser::Directive" being a subclass of
"Tree::DAG_Node", "Tree::DAG_Node::new" will be used.
$d->name
$d->name($name)
In the first form get the directive or context's name.
In the second form set the new name of the directive
or context to the lowercase version of $name and
return the original name.
$d->value
$d->value($value)
In the first form get the directive's value in string
form. In the second form, return the previous directive
value in string form and set the new directive
value to $value. $value can be set to undef.
If the value is being set, then $value is saved so
another call to "value" will return $value. If $value
is defined, then $value is also parsed into an array
of elements that can be retrieved with the
"value_array_ref" or "get_value_array" methods. The
parser separates elements by whitespace, unless
whitespace separated elements are enclosed by "'s.
Protected quotes, \", are honored to not begin or end
a value element.
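The splitting rule described above can be sketched in a few lines of Perl. This is only an illustration of the rule, not the module's actual parser: the helper name split_directive_value is hypothetical, and the handling of \\ is an assumption taken from the description of "set_value_array".

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Apache::ConfigParser::Directive.
# Splits a directive value on whitespace, keeps text inside "'s
# together as one element, and unprotects \" and \\ afterwards.
sub split_directive_value {
    my ($string) = @_;
    my @elements;
    # Each pass matches either a "quoted element" or a bare word.
    while ($string =~ /\G\s*(?:"((?:[^"\\]|\\.)*)"|((?:[^\s"\\]|\\.)+))/gc) {
        my $element = defined $1 ? $1 : $2;
        $element =~ s/\\(["\\])/$1/g;    # \" becomes ", \\ becomes \
        push @elements, $element;
    }
    return @elements;
}

my @value = split_directive_value('"%h %l %u %t \"%r\" %>s %b" common');
print scalar(@value), " elements\n";
print "$_\n" for @value;
```

For the LogFormat value from the SYNOPSIS this sketch yields two elements, with the protected quotes around %r no longer protected in the first one.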
$d->orig_value
$d->orig_value($value)
Identical behavior as "value", except that this
applies to the `original' value. Use
"orig_value_array_ref" or "get_orig_value_array" to get the
value elements.
$d->value_array_ref
$d->value_array_ref(\@array)
In the first form get a reference to the value array.
This can return an undefined value if an undefined
value was passed to "value" or an undefined reference
was passed to "value_array_ref". In the second form
"value_array_ref" sets the value array and value
string. Both forms of "value_array_ref" return the
original array reference.
If you modify the value array reference after getting
it and do not use "value_array_ref" or "set_value_array"
to set the value, then the string returned from
"value" will not be consistent with the array.
$d->orig_value_array_ref
$d->orig_value_array_ref(\@array)
Identical behavior as "value_array_ref", except that
this applies to the `original' value.
$d->get_value_array
Get the value array elements. If the value was set to
an undefined value using "value", then
"get_value_array" will return an empty list in a list
context, an undefined value in a scalar context, or
nothing in a void context.
$d->get_orig_value_array
This has the same behavior as "get_value_array" except
that it operates on the `original' value.
$d->set_value_array(@values)
Set the value array elements. If no elements are
passed in, then the value will be defined but empty
and a following call to "get_value_array" will return
an empty array.
After setting the value elements with this method, the
string returned from calling "value" is a concatenation
of the elements in a form that could be used in an Apache
configuration file. If an element contains whitespace,
then "'s are placed around it as it is concatenated into
the value string. If an element contains a " or a \, then
a copy of the element is made, the character is protected,
i.e. \" or \\, and the copy is placed into the value string.
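As a rough illustration of the concatenation described above, the following sketch rebuilds a value string from its elements. The helper name join_directive_value is hypothetical, and joining the elements with a single space is an assumption; the module's own code may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Apache::ConfigParser::Directive.
sub join_directive_value {
    my @elements = @_;
    my @quoted;
    foreach my $element (@elements) {
        my $copy = $element;              # work on a copy of the element
        $copy =~ s/(["\\])/\\$1/g;        # protect " and \ as \" and \\
        $copy = qq{"$copy"} if $copy =~ /\s/;   # quote if it has whitespace
        push @quoted, $copy;
    }
    # Assumption: elements are separated by a single space.
    return join(' ', @quoted);
}

print join_directive_value('%{Referer}i -> %U', 'referer'), "\n";
```

With the referer example from the SYNOPSIS, this produces the same string that is passed to "value" there.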
$d->set_orig_value_array(@values)
This has the same behavior as "set_value_array" except
that it operates on the `original' value, so use
"orig_value" to get the string version.
$d->filename
$d->filename($filename)
In the first form get the filename where this particular
directive or context appears. In the second form
set the new filename of the directive or context and
return the original filename.
$d->line_number
$d->line_number($line_number)
In the first form get the line number where the directive
or context appears in a filename. In the second
form set the new line number of the directive or context
and return the original line number.
SEE ALSO
the Apache::ConfigParser manpage and the
Tree::DAG_Node manpage.
NAME
Apache::ConfigParser - Load Apache configuration files
SYNOPSIS
use Apache::ConfigParser;
# Create a new empty parser.
my $c1 = Apache::ConfigParser->new;
# Create a new parser and load a specific configuration file.
my $c2 =
Apache::ConfigParser->new('/etc/httpd/conf/httpd.conf');
# Load a configuration file explicitly.
$c1->parse_file('/etc/httpd/conf/httpd.conf');
# Get the root of a tree that represents the configuration file.
# This is an Apache::ConfigParser::Directive object.
my $root = $c1->root;
# Get all of the directives and starts of contexts.
my @directives = $root->daughters;
# Get the first directive's name.
my $d_name = $directives[0]->name;
# This directive appeared in this file, which may be in an
# Include'd file.
my $d_filename = $directives[0]->filename;
# And it begins on this line number.
my $d_line_number = $directives[0]->line_number;
# Find all the CustomLog entries, regardless of context.
my @custom_logs =
$c1->find_at_and_down_option_names('CustomLog');
# Get the first CustomLog.
my $custom_log = $custom_logs[0];
# Get the value in string form.
my $custom_log_value = $custom_log->value;
# Get the value in array form already split.
my @custom_log_args = $custom_log->get_value_array;
# Get the same array but a reference to it.
my $custom_log_args = $custom_log->value_array_ref;
# The first value in a CustomLog is the filename of the log.
my $custom_log_file = $custom_log_args->[0];
# Get the original value before the path has been made absolute.
@custom_log_args = $custom_log->get_orig_value_array;
$custom_log_file = $custom_log_args[0];
DESCRIPTION
The "Apache::ConfigParser" module is used to load an
Apache configuration file to allow programs to determine
Apache's configuration options. The resulting object contains
a tree based structure using the
"Apache::ConfigParser::Directive" class, which is a subclass of
"Tree::DAG_Node", so all of the methods that enable tree
based searches and modifications are available. The tree
structure is used to represent the ability to nest sections,
such as <VirtualHost>, <Directory>, etc.
Apache does a great job of checking Apache configuration
files for errors and this module leaves most of that to
Apache. This module does minimal configuration file
checking. The module currently checks for:
Start and end context names match
The module checks if the start and end context names
match. If the end context name does not match the
start context name, then the end context is ignored. The
module does not even check if the configuration options
or modules have valid names.
PARSING
Notes regarding parsing of configuration files.
Line continuation is treated exactly as in Apache 1.3.20.
Line continuation occurs only when the line ends in
[^\\]\\\r?\n. If the line ends in two \'s, then the parser
replaces the two \'s with one \ and does not continue the line.
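The continuation rule above can be sketched as follows. This is a minimal illustration, not the module's code: the helper name join_continued_lines is hypothetical, and the sketch assumes that a continued line is joined to the next one with nothing inserted in between.

```perl
use strict;
use warnings;

# Hypothetical helper: turns physical lines into logical lines.
sub join_continued_lines {
    my @physical = @_;
    my @logical;
    my $pending = '';
    foreach my $line (@physical) {
        if ($line =~ s/\\\\(\r?\n)\z/\\$1/) {
            # Two trailing \'s: keep one \ and do not continue.
            push @logical, $pending . $line;
            $pending = '';
        } elsif ($line =~ s/([^\\])\\\r?\n\z/$1/) {
            # One trailing \ after a non-\ character: continue.
            # Assumption: nothing is inserted at the join.
            $pending .= $line;
        } else {
            push @logical, $pending . $line;
            $pending = '';
        }
    }
    push @logical, $pending if length $pending;
    return @logical;
}

my @lines = join_continued_lines(
    "CustomLog logs/access_log \\\n",
    "common\n",
    "ServerRoot c:\\\\\n",
);
print @lines;
```

Here the first two physical lines become one CustomLog line, while the ServerRoot line, which ends in two \'s, keeps a single \ and stands alone.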
EXPORTED VARIABLES
The following variables are exported via @EXPORT_OK.
%directive_takes_rel_path
This hash is keyed by the lowercase version of a
directive name. The hash value is a subroutine reference.
If a hash entry exists for a particular directive,
then the directive can take a relative path that
may need to be made absolute. The subroutine takes a
single argument, which should be the potential file
path entry, and it returns 1 if the potential filename
is a valid filename that can be made absolute, 0
otherwise.
For example, ErrorLog can take a filename, a piped
command or a syslog:* entry. The particular subroutine
for ErrorLog checks if the value is a filename.
On Windows, these subroutines return 0 if the value is
'nul'.
These subroutines do not remove any "'s before checking
on the type of value.
This is a list of directives and any special values to
check for as of Apache 1.3.20.
AccessConfig
AuthGroupFile
AuthUserFile
CookieLog
CustomLog check for "| command"
ErrorLog check for "| command" or syslog:
Include
LoadFile
LoadModule
LockFile
MimeMagicFile
PidFile
RefererLog check for "| command"
ResourceConf
ScoreBoardFile
ScriptLog
TransferLog check for "| command"
TypesConfig
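To make the shape of a %directive_takes_rel_path entry concrete, here is a hedged sketch of what a checker for ErrorLog might look like. The subroutine name error_log_is_path and the hash %takes_rel_path are hypothetical stand-ins, and the exact patterns are assumptions based on the list above rather than the module's real tests.

```perl
use strict;
use warnings;

# Hypothetical checker in the style described for ErrorLog.
# Returns 1 if the value looks like a filename, 0 otherwise.
sub error_log_is_path {
    my ($value) = @_;
    return 0 unless defined $value && length $value;
    return 0 if $value =~ /^\|/;          # piped command
    return 0 if $value =~ /^syslog:/;     # syslog:* entry
    return 0 if $^O eq 'MSWin32' && lc($value) eq 'nul';
    return 1;
}

# Stand-in for the exported hash, keyed by lowercase directive name.
my %takes_rel_path = (errorlog => \&error_log_is_path);

foreach my $value ('logs/error_log', '|/usr/bin/logger', 'syslog:local7') {
    my $is_path = $takes_rel_path{lc 'ErrorLog'}->($value);
    print "$value: $is_path\n";
}
```

A caller would look up the lowercase directive name and only try to make the value absolute when the checker returns 1.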
METHODS
The following methods are available:
$c = Apache::ConfigParser->new
$c = Apache::ConfigParser->new({options})
$c = Apache::ConfigParser->new($filename)
$c = Apache::ConfigParser->new({options}, $filename)
Create a new "Apache::ConfigParser" object that stores
the content of an Apache configuration file. The
first optional argument is a reference to a hash that
contains options to new.
If $filename is given, then the contents of $filename
will be loaded. If $filename cannot be opened then
$! will contain the error message for the failed
open() and new will return an empty list in a list
context, an undefined value in a scalar context, or
nothing in a void context.
The currently recognized options are:
pre_transform_path_sub => sub { }
This allows the file or directory name for any
directive that takes a filename or directory name to
be transformed by this subroutine before it is
made absolute with ServerRoot. This transformation
is applied to any of the directives that
appear in %directive_takes_rel_path.
The subroutine is passed the following arguments:
Apache::ConfigParser object
lowercase string of the configuration directive
the file or directory name to transform
post_transform_path_sub => sub { }
This allows the file or directory name for any
directive that takes a filename or directory name to
be transformed by this subroutine after it is made
absolute with ServerRoot. This transformation is
applied to the same directives as
pre_transform_path_sub.
The subroutine is passed the following arguments:
Apache::ConfigParser object
lowercase version of the configuration directive
the file or directory name to transform
One example of where the transformations are useful is
when the Apache configuration directory on one host is
NFS exported to another host and the remote host
parses the configuration file using
"Apache::ConfigParser". The paths to the access logs must
then be transformed so that the remote host can properly
find them.
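For the NFS case described above, a transformation subroutine might look like the sketch below. The mount point /net/www1 and the name add_nfs_prefix are made up for illustration, and the sketch assumes that the subroutine returns the transformed path, which the option description does not state explicitly.

```perl
use strict;
use warnings;

# Hypothetical mount point where the remote host sees the web
# server's filesystem.
my $nfs_prefix = '/net/www1';

# Arguments follow the order listed for post_transform_path_sub:
# the parser object, the lowercase directive name, and the path.
# Assumption: the return value is used as the new path.
sub add_nfs_prefix {
    my ($parser, $directive, $path) = @_;
    return $path unless $path =~ m{^/};   # leave non-absolute values alone
    return $nfs_prefix . $path;
}

# Assumed usage with the module, shown as a comment so that this
# sketch runs without it:
#   my $c = Apache::ConfigParser->new(
#       { post_transform_path_sub => \&add_nfs_prefix },
#       '/net/www1/etc/httpd/conf/httpd.conf');

print add_nfs_prefix(undef, 'customlog', '/var/log/httpd/access_log'), "\n";
```

Using the post-transform hook here means the path has already been made absolute with ServerRoot before the prefix is added.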
$c->DESTROY
There is an explicit DESTROY method for this class to
destroy the tree, since it has cyclical references.
$c->parse_file($filename)
This method takes a filename and adds it to the
already loaded configuration file inside the object.
If a previous Apache configuration file was loaded
either with new or parse_file and the configuration
file did not close all of its contexts, such as
<VirtualHost>, then the new configuration options in
$filename will be added to the existing context. If
$filename could not be opened, then $! will contain
the reason for open's failure.
$c->root
Returns the root of the tree that represents the
Apache configuration file. Each object here is an
"Apache::ConfigParser::Directive".
$c->find_at_and_down_option_names('option', ...)
$c->find_at_and_down_option_names($node, 'option', ...)
In list context, returns the list of all of $c's options
that match the option names listed at the level of
$node and below. In scalar context, returns the number
of such options. The level here is in a tree
sense, not in the sense that some options appear before
or after $node in the configuration file. If $node is given,
then the search is started at $node, includes $node and
searches $node's children. If $node is not passed,
then it starts at the top of the tree and searches the
whole configuration file.
All of the option names are made lowercase.
This is useful if you want to find all of the
CustomLog's in the configuration file:
my @logs = $c->find_at_and_down_option_names('CustomLog');
$c->find_in_siblings_option_names('option', ...)
$c->find_in_siblings_option_names($node, 'option', ...)
In list context, returns the list of all of $c's options
that match the option names at the same level of
$node, that is siblings of $node. In scalar context,
returns the number of such options. The level here is
in a tree sense, not in the sense that some options
appear before or after $node in the configuration file.
If $node is not given, or $node is passed and it is
"$c->root", then it will search through root's children.
All of the option names are made lowercase.
$c->find_in_siblings_and_up_option_names($node, 'option',
...)
In list context, returns the list of all of $c's options
that match the option names at the same level of
$node, that is siblings of $node, and above $node. In
scalar context, returns the number of such options.
The level here is in a tree sense, not in the sense
that some options appear before or after $node in the
configuration file. In this method $node is a required
argument, because it does not make sense to check the
root node.
All of the option names are made lowercase.
This is useful when you find an option and you want to
find an associated option. For example, find all of
the CustomLog's and find the associated ServerName.
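foreach my $log_node
($c->find_at_and_down_option_names('CustomLog')) {
# The first value of a CustomLog is the log filename.
my $log_filename = ($log_node->get_value_array)[0];
# Search the log's siblings and ancestors for a ServerName.
my @server_names =
$c->find_in_siblings_and_up_option_names($log_node, 'ServerName');
next unless @server_names;
my $server_name = $server_names[0]->value;
print "ServerName for $log_filename is $server_name\n";
}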
$c->dump
Return an array of lines that represents the internal
state of the tree.
SEE ALSO
the Apache::ConfigParser::Directive manpage and the
Tree::DAG_Node manpage.