An Essay on the REBOL Series
for the Busy Non-REBOL Programmer


THE SERIES PSEUDOTYPE

REBOL uses typed data, as opposed to typed variables.  A REBOL
word can refer to any legal REBOL value.  However, REBOL operations
(operators, primitive functions, mezzanine functions, and user-
defined functions) can check the types of operands/arguments for
compatibility and generate run-time errors if asked to operate on
inappropriately-typed values.  This checking process is streamlined
through the use of "pseudotypes", which correspond to sets of
concrete "datatypes" that exhibit common characteristics.

One of the most important pseudotypes in REBOL is "series",
which comprises any datatype that can be understood as a sequence
of values.  The series pseudotype is subdivided into the
"any-string" and "any-block" pseudotypes; the first for any value
that can be understood as a sequence of characters (such as
"string") and the second for sequences of REBOL values of any type.
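
As a rough illustration, a pseudotype can be pictured as a set of
datatype names, and argument checking as a membership test.  The
sketch below is Python, not REBOL, and everything in it is an
assumption made for illustration: the datatype lists are a small
subset, and the helper name check_arg is hypothetical.

```python
# Illustrative subset only; REBOL defines more datatypes than listed here.
ANY_STRING = frozenset({"string", "file", "url"})
ANY_BLOCK = frozenset({"block", "paren", "path"})

# The series pseudotype modeled as the union of its two subdivisions.
SERIES = ANY_STRING | ANY_BLOCK


def check_arg(datatype, accepted):
    """Raise TypeError unless the datatype name is in the accepted set."""
    if datatype not in accepted:
        raise TypeError(f"expected one of {sorted(accepted)}, got {datatype}")
    return True
```

A function that accepts any series would call check_arg with SERIES,
while one that needs characters would pass ANY_STRING instead.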


TERMINOLOGY OF THE MODEL

The term "sequence" is used in this model as a name for the
collection of data underneath a series.  That term is NOT used in
REBOL, which provides no way to access a "naked sequence" except via
some series which refers to it.

This model uses the non-REBOL term "entity" to refer to a specific
data structure that represents a particular REBOL value.  This was
chosen instead of "object" or "instance", which have connotations
that are potentially distracting, both for experienced REBOL users
and for experienced OO programmers.

A REBOL series also requires the notion of "current position" within
the sequence.  REBOL documentation describes the behavior of several
functions (e.g., 'at, 'back, 'index?, 'insert, 'next, etc.) in terms
of the position of the series passed to them.  Consequently, having
a current position is a necessary attribute of any series.

Therefore, a series may be modeled as a compound entity with two
attributes: a reference to a sequence, and a position within
that sequence.
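
The two-attribute view can be sketched in Python (not REBOL) as
follows.  The class and function names are hypothetical, the
position is stored as a zero-based offset purely for convenience,
and the functions only imitate the documented behavior of 'next,
'back, and 'index? under the assumptions of this model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    seq: list        # reference to the shared underlying sequence
    pos: int = 0     # zero-based offset of the current position


def next_(s):
    """Model of 'next: one position forward, stopping at the end."""
    return Series(s.seq, min(s.pos + 1, len(s.seq)))


def back(s):
    """Model of 'back: one position backward, stopping at the head."""
    return Series(s.seq, max(s.pos - 1, 0))


def index(s):
    """Model of 'index?: positions are reported counting from 1."""
    return s.pos + 1


a = Series(list("abcd"))   # positioned at the head
b = next_(next_(a))        # same sequence, two positions further on
```

Here a and b are distinct series entities, yet both refer to the
same underlying sequence; only their positions differ.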

The underlying sequence may also be modeled as a compound entity;
it contains the actual data values and two additional attributes:
an "end" position (last position in use), and an "allocation size"
(last position which could be used without requiring additional
memory management activity).

This last attribute is almost entirely hidden at the REBOL source
level.  It only appears in connection with 'make, which allows space
to be reserved for expansion of the sequence underlying a newly-
created series value.  REBOL manages memory automatically, with no
further interaction with the user or script.  Therefore, this
attribute is ignored for the rest of this essay.
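
For readers who want a picture of what is being ignored, here is a
Python (not REBOL) sketch of a sequence entity with an end position
and an allocation size.  The names are hypothetical, the reserve
parameter only loosely imitates the space reservation offered by
'make, and the doubling growth policy is an assumption made for
illustration, not a description of REBOL's memory management.

```python
class Sequence:
    """Data values plus an end position and an allocation size."""

    def __init__(self, reserve=0):
        self.slots = [None] * reserve   # storage, including unused slots
        self.end = 0                    # number of slots currently in use
        self.grow_count = 0             # counts simulated reallocations

    @property
    def allocation(self):
        """Number of slots usable without further memory management."""
        return len(self.slots)

    def append(self, value):
        if self.end == self.allocation:
            # Out of reserved room: simulate the hidden memory management.
            self.slots.extend([None] * max(1, self.allocation))
            self.grow_count += 1
        self.slots[self.end] = value
        self.end += 1

    def values(self):
        """Return only the slots that are in use."""
        return self.slots[:self.end]
```

A sequence created with enough reserved space never needs to grow
while it is filled; one created with none grows behind the scenes,
and the visible values are the same either way.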

The REBOL documentation refers to "the end of a series" (e.g. in the
description of 'tail).  For this model, that phrase is viewed as
verbal shorthand for "the end position of the sequence referenced
by the series".  This is a reasonable abbreviation, as a series
refers to only one sequence, which has only one end position.
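
In terms of the model, the shorthand can be sketched in Python (not
REBOL) as below.  The names are hypothetical, and the zero-based
offset used for the tail position (one past the last value) is an
assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    seq: list    # reference to the underlying sequence
    pos: int     # zero-based current position


def tail(s):
    """Model of 'tail: same sequence, positioned at the sequence's end."""
    return Series(s.seq, len(s.seq))


data = [10, 20, 30]
first = Series(data, 0)    # at the head
second = Series(data, 2)   # further along the same sequence
```

Because the end belongs to the sequence rather than to either
series, tail(first) and tail(second) describe the same position, and
both follow the sequence if it later grows.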

This model assumes that some form of internal reference (as opposed
to multiple copies of the details) is used for all REBOL values.
Although a variety of standard computer science techniques could be
used here (e.g., tag bits, pointers, descriptors, etc.), those
distinctions are below the level of interest to this model.

Enough generalities; let's do some examples!

(continued in essay/2)
