Hi all,

I just went through the whole list in attrs.html and tried to classify
each attribute, only partially starting from Geoff's list. To explain
the attributes to myself, I find I'm basically coming up with a
classification into groups and some subgroups.

Here goes - with a few examples for each group to illustrate what I mean:
* Indexing Control
        * where - determines which files are going to be indexed
                allow_virtual_hosts
                exclude_urls
                http_proxy_exclude
                limit_normalized
        * what - determines which parts of each file are going to be indexed
                allow_numbers
                keywords_meta_tag_names
        * how - determines how the matches are stored in the databases
                compression_level
                max_description_length
                remove_default_doc
        * timing - wait time, timeout
                server_wait_time
                timeout
        * out - information sent to the "outside world"
                maintainer
                robotstxt_name
                user_agent
* External Parsers
                external_parsers
                pdf_parser
* URLs - URL matching and pattern replacement
                common_url_parts
* Extra output - optional extra output to be produced
        (split into indexing and searching?)
                create_url_list (indexing)
                doc_list (indexing)
                htnotify_sender (indexing)
                logging (searching)
* Searching Control
        * UI - user interface to the searching process
                allow_in_form
                method_names
                sort_names
        * Method - algorithms and methods to be used
                match_method
                max_prefix_matches
                prefix_match_character
* Search Presentation
        * how - algorithms and decisions for presenting the results
                add_anchors_to_excerpt
                excerpt_show_top
        * text - literal texts to be used in variables for the templates
                no_next_page_text
                no_page_number_text
        * files - files to be used as templates
                nothing_found_file
                search_results_footer
                start_blank
* Ranking Factors
        * indexing - ranking influenced during the indexing process
                description_factor
                heading_factor_1 - _6
        * searching - ranking influenced during the searching process
                backlink_factor
                date_factor
* Databases - databases used and their location
                common_dir
                database_base
                database_dir
                doc_index
                endings_affix_file

I'm still left with a few question marks, basically attributes that could
fall into two groups:
        bad_word_list
                (indexing control/what *and* searching control/method)
        iso_8601
                (search presentation/how *and* extra output)
        minimum_word_length
                (indexing control/what *and* searching control/UI)
        valid_punctuation
                (indexing control/how *and* searching control/UI)
Two groups isn't bad in and of itself - it just makes it hard to decide in
which separate file to put these attributes.
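
To make the overlap concrete, here is a minimal sketch in Python (not part
of ht://Dig). It records a few of the placements listed above and reports
which attributes land in more than one top-level group. The names
PLACEMENTS, ambiguous and by_group are made up for illustration, and only
a handful of attributes are included.

```python
from collections import defaultdict

# Hypothetical table: attribute -> list of (group, subgroup) placements,
# copied from the classification above. None means "no subgroup given".
PLACEMENTS = {
    "allow_numbers": [("Indexing Control", "what")],
    "match_method": [("Searching Control", "Method")],
    "bad_word_list": [("Indexing Control", "what"),
                      ("Searching Control", "Method")],
    "iso_8601": [("Search Presentation", "how"),
                 ("Extra output", None)],
    "minimum_word_length": [("Indexing Control", "what"),
                            ("Searching Control", "UI")],
    "valid_punctuation": [("Indexing Control", "how"),
                          ("Searching Control", "UI")],
}


def ambiguous(placements):
    """Return the attributes placed in more than one top-level group."""
    return sorted(
        name
        for name, spots in placements.items()
        if len({group for group, _ in spots}) > 1
    )


def by_group(placements):
    """Invert the table: group -> sorted attribute names (one file per group)."""
    files = defaultdict(list)
    for name, spots in placements.items():
        for group, _ in spots:
            if name not in files[group]:
                files[group].append(name)
    return {group: sorted(names) for group, names in files.items()}


if __name__ == "__main__":
    print(ambiguous(PLACEMENTS))
    for group, names in sorted(by_group(PLACEMENTS).items()):
        print(group, "->", ", ".join(names))
```

With a table like this, by_group shows what each per-group file would
contain if an attribute were simply listed in every group it belongs to,
which is one possible way around the "which file?" question.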

Comments?

At 14:09 1999-02-14 -0400, Geoff Hutchison wrote:
>
>I've been wondering about splitting the file into logical groups. This
>would let us break attrs.html into smaller, more comprehensible chunks. In
>addition, it would enable examples using multiple attributes as people
>might use in their config files.
>
>Here's some suggested breakdowns:
>* External Parsers -> Should be separate doc anyway.
>* Ranking Factors -> Make more sense in one doc, can give examples of multiple
>               rankings on a site changing the output.
>* Server Control -> max_hopcount, server_max_docs, limit_urls_to
>* Search Formatting -> Templates, Graphics, Result Sorting
>* Fuzzy Control -> search_algorithms, prefix_match_char
>* Filenames -> common_dir, database_dir
>
>This would probably help people figure out "How do I make my own
>templates?" and "How can I turn on fuzzy matching?" easily.
>
>Any thoughts? What breakdowns am I missing? What attributes have I forgotten?
>-Geoff
>
>
>------------------------------------
>To unsubscribe from the htdig3-dev mailing list, send a message to
>[EMAIL PROTECTED] containing the single word "unsubscribe" in
>the SUBJECT of the message.
>

Marjolein Katsma      [EMAIL PROTECTED]
Java Woman - http://javawoman.com/