On 4/26/2010 1:33 PM, Justin J Stark wrote:
> The way habari handles filters and formatters is very bad.
Formatting for Habari was never anticipated to be as extensive a
proposition as what has been described. Filters are certainly on my
list of "Things Habari should have done differently." Nonetheless, here
are some random additional thoughts...
A more concrete description of what a new system should accomplish (a
real use case rather than anonymous "filter X" and "filter Y"
placeholders) would be most helpful for producing a practical solution.
It may be useful to retain some of the existing functionality such that
there are standard filter suffixes on which the theme can count. For
example:
    $post->foo           // raw content
    $post->foo_filtered  // content with filters applied
    $post->foo_safe      // raw content, filtered for XSS
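As a rough illustration of how such suffixes could be resolved, here is a minimal sketch. It is written in Python for brevity rather than Habari's PHP, and the Post class, its constructor arguments, and the use of HTML escaping as a stand-in for real XSS filtering are all assumptions made for the example, not Habari's actual code.

```python
import html


class Post:
    """Hypothetical stand-in for a post object; not Habari's real class."""

    def __init__(self, fields, filters=None):
        self._fields = dict(fields)
        self._filters = list(filters or [])

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        fields = self.__dict__.get("_fields", {})
        if name in fields:
            return fields[name]  # raw content
        if name.endswith("_filtered"):
            base = name[: -len("_filtered")]
            if base in fields:
                value = fields[base]
                for apply_filter in self.__dict__.get("_filters", []):
                    value = apply_filter(value)
                return value
        if name.endswith("_safe"):
            base = name[: -len("_safe")]
            if base in fields:
                # Assumption: escaping stands in for a real XSS filter.
                return html.escape(fields[base])
        raise AttributeError(name)


post = Post({"title": "<b>hi</b>"}, filters=[str.upper])
print(post.title)
print(post.title_filtered)
print(post.title_safe)
```

The point of the sketch is only that a theme can rely on the same three names for every field, whatever filters happen to be installed.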
There is already a way to provide multiple filters via a single plugin,
although it is broken because of a missing Interface, FormatPlugin. If
your plugin class implements FormatPlugin, then the Format class will
allow any of its methods to be used as formatters. Maybe this isn't ideal.
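The marker-interface idea can be sketched as follows. This is a Python analogue written for illustration only: FormatRegistry, load_plugin, apply, and CasePlugin are hypothetical names, and only the FormatPlugin name comes from the description above.

```python
class FormatPlugin:
    """Marker interface: implementing it opts a plugin's methods in."""


class FormatRegistry:
    """Hypothetical registry standing in for the Format class."""

    def __init__(self):
        self.formatters = {}

    def load_plugin(self, plugin):
        # Plugins that do not implement the marker interface are ignored.
        if not isinstance(plugin, FormatPlugin):
            return
        for name in dir(plugin):
            if name.startswith("_"):
                continue
            member = getattr(plugin, name)
            if callable(member):
                self.formatters[name] = member

    def apply(self, name, content):
        return self.formatters[name](content)


class CasePlugin(FormatPlugin):
    """One plugin providing two formatters."""

    def shout(self, content):
        return content.upper()

    def whisper(self, content):
        return content.lower()


registry = FormatRegistry()
registry.load_plugin(CasePlugin())
print(sorted(registry.formatters))
print(registry.apply("shout", "Hello"))
```

Exposing every public method is convenient but indiscriminate, which may be part of why this approach isn't ideal.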
Input and Output filters should be applied to any filtered content such
that an intermediate state is produced. The theme would apply the
output filters, and the content entry system would apply the input
filters. The result is that content could be re-purposed for a
different spec target on output; content could be forced (somehow) to be
valid when output in its target format, like XHTML.
There could also be a preliminary intermediate state for cached
pre-processing. For example, you might supply post content like this:
    My cat has caught [catch_count] mice.
The input filter would convert the content to an intermediate cached
state and store it in the content_cached field (currently in our
database and unused). During this input filtering phase, the formatting
used by the user (Markdown, Textile, etc.) is converted into a
machine-readable neutral format (XML, HTML, a serialized PHP array,
etc.) and cached.
On the first output filter pass, the string [catch_count] would need to
be replaced by an actual number. This simulates things like LaTeX,
heading-to-image font replacements, and the like. The content resulting
from this pass would still be in machine-readable neutral format.
On the second output filter pass, the theme would define what format is
required for output (XML, HTML, etc.) and call formatters to convert the
neutral format to that format. It is imperative (and different from
other suggested implementations I've heard) that the *theme* define the
ultimate output format, not the user; otherwise the content may not
match the format of the page the theme produces.
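The three phases described above can be sketched roughly like this. The code is Python rather than PHP, the token list used as the "neutral format" and the toy *emphasis* markup are assumptions chosen to keep the example short, and none of the function names exist in Habari.

```python
import html
import re

TOKEN = re.compile(r"(\*[^*]+\*|\[\w+\])")


def input_filter(raw):
    """Input phase: user markup -> neutral token list (what gets cached)."""
    tokens = []
    for part in TOKEN.split(raw):
        if not part:
            continue
        if TOKEN.fullmatch(part):
            kind = "em" if part[0] == "*" else "var"
            tokens.append((kind, part[1:-1]))
        else:
            tokens.append(("text", part))
    return tokens


def resolve_placeholders(tokens, values):
    """First output pass: fill in placeholders; result stays neutral."""
    resolved = []
    for kind, value in tokens:
        if kind == "var" and value in values:
            resolved.append(("text", str(values[value])))
        else:
            resolved.append((kind, value))
    return resolved


def render(tokens, target):
    """Second output pass: the theme picks the target format."""
    if target == "html":
        pieces = []
        for kind, value in tokens:
            escaped = html.escape(value)
            if kind == "em":
                pieces.append("<em>" + escaped + "</em>")
            elif kind == "var":
                pieces.append("[" + escaped + "]")  # unresolved placeholder
            else:
                pieces.append(escaped)
        return "".join(pieces)
    if target == "text":
        return "".join("[" + v + "]" if k == "var" else v for k, v in tokens)
    raise ValueError("unknown target format: " + target)


cached = input_filter("My cat has caught [catch_count] *mice*.")
resolved = resolve_placeholders(cached, {"catch_count": 7})
print(render(resolved, "html"))
print(render(resolved, "text"))
```

Only the last call knows about the output format, so the same cached content can serve an HTML page and a plain-text feed without being re-parsed.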
In each phase, a battery of converters could be applied to arrive at the
intermediate or final states. These should not be weighted (like in
Drupal or in Habari's priority model) but dependent, like Habari's stack
system. It may even be possible to re-use the Stack system to define
filters that should be applied.
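To illustrate what "dependent rather than weighted" might mean in practice, here is a small sketch that orders filters from declared dependencies using Python's standard graphlib module (Python 3.9+). The filter names and the order_filters helper are made up for the example, and this is not how Habari's Stack system is implemented.

```python
from graphlib import CycleError, TopologicalSorter


def order_filters(dependencies):
    """Return filter names so that each runs after the filters it needs.

    `dependencies` maps a filter name to the set of filters that must
    run before it. Raises CycleError if the declarations are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical filters: each one states what it depends on, not a weight.
declared = {
    "markdown": set(),
    "smart_quotes": {"markdown"},
    "xss_clean": {"smart_quotes"},
}
print(order_filters(declared))
```

With weights, two plugins that both pick the same number leave the order undefined; with dependencies, each filter only has to name what it needs, and a circular declaration is reported as an error instead of silently producing some order.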
One thing that we should try very hard to avoid is the Input Format
system in Drupal. It sounds like a good idea, but in practice it's a
piece of crap. Letting users choose what input format they want per
node is asking for trouble. There are aggravations in there beyond what
is readily apparent.
To explain a bit, a Drupal "format" consists of one or more "filters"
applied in a specific order. The order is determined by "weight", a
Drupal term synonymous with Habari's "priority".
Drupal allows you to configure Input Formats per role. So you can say
that an administrator can have access to a Raw HTML format, a Filtered
HTML format, and a WYSIWYG format. But an editor role could be
configured only to see the WYSIWYG format when editing content. If you
then create content as an administrator using the Raw HTML format, no
editor can edit it, because they do not have permission to use that
format. Debugging these issues is very tedious.
I would personally prefer to make this as simple as possible. Note that
having an intermediate format trivializes this process somewhat, since
we would only store that intermediate format, and then pull and push the
content from it. That is, if content is written in Markdown and saved,
a user who prefers Textile would see Textile, because the filter
would convert the intermediate format back to Textile for them.
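A rough sketch of that round trip, assuming the stored intermediate format is a simple list of (kind, value) tokens. The token kinds, the SERIALIZERS table, and for_editing are hypothetical, and only the most basic emphasis and strong markers of each syntax are covered.

```python
# Per-syntax templates for turning neutral tokens back into editable text.
SERIALIZERS = {
    "markdown": {"text": "{}", "em": "*{}*", "strong": "**{}**"},
    "textile": {"text": "{}", "em": "_{}_", "strong": "*{}*"},
}


def for_editing(tokens, syntax):
    """Convert stored neutral tokens into the syntax this user prefers."""
    templates = SERIALIZERS[syntax]
    return "".join(templates[kind].format(value) for kind, value in tokens)


# Stored once, regardless of which syntax the author originally typed.
stored = [
    ("text", "A "),
    ("em", "quick"),
    ("text", " and "),
    ("strong", "bold"),
    ("text", " fox"),
]
print(for_editing(stored, "markdown"))
print(for_editing(stored, "textile"))
```

Because nothing records which syntax the content was originally typed in, there is no per-post format setting for anyone to be locked out of.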
Anyway, choosing a format on the publishing page should be avoided, IMO.
Alright, I'm out of the time I allotted for this email. :-\
Owen