Re: [Tagging] Reviving the conditions debate

Colin Smale Thu, 14 Jun 2012 15:25:46 -0700

Martin, if it walks like a duck and quacks like a duck, then it had better be a duck... What I mean by this is that if the grammar is so English-like that people are tempted to use constructions which are not (or not quite) supported by the grammar, or if the way it works is contrary to how the English language would interpret it, then "errors will occur". Plus, of course, the majority of users will not have English as their first language, and we have to make this accessible to the man-in-the-street and not allow it to become so obfuscated that only PhDs need apply.

Bottom line: I question whether making it kind of pseudo-English-like is the right goal to aim at. A simple grammar which is (mis)understood equally over the whole world might be better. Your post below is full of examples supporting my point. The grammar should be derived from what you are trying to model, just as a (descriptive) grammar for English is reverse-engineered from the way the language is used. If you start with the premise that the answer must be expressed in ANTLR and shouldn't include brackets, that's putting the cart before the horse. Please feel free to carry on with your experimentation to see if you can make a grammar on this basis, but remember that humans have to read and write this stuff (which does not detract from my earlier assertion that machine-readability is a slightly higher priority than human-readability) and they often need clear boundaries to make the most of their creativity.

If you put a child in the middle of an "infinitely large" field with no boundaries obvious to the child, they won't move far from where you put them. If you put the same child in a large fenced enclosure, they will explore every inch. Give a child a paintbrush in front of a huge blank wall and you will get a small picture. Mark out a "frame" on the wall and say "paint in this" and it will all get used. Give a mapper no limits on tagging, and many things will not get tagged (because of inner doubts about how to do it). Give the same mapper a menu of 100 tags which can be used, and he will use many more of them.

> Human language is sadly not very precise: "except taxis AND bicycles" does not mean, you must be in a vehicle that is both (it means if not taxis AND if not bicycles),

The human language here is extremely precise to any fluent English-speaker. It means what it says. It's the IT-based interpretation of the word AND which leads to the grammar misinterpreting the intention. Think of the expression:


    a * b + c * d

To the "untrained eye" this may appear ambiguous or be interpreted differently to how a compiler will interpret it. Nonetheless it's valid code and no compiler will complain. However, style-wise there is a school of thought that such constructions are unsafe because a "bug" caused by precedence problems is difficult to find by a quick inspection. Mandating the use of parentheses to make the programmer's intentions clear to a code-reviewer helps to detect bugs early, and has the desirable side-effect of making the programmer think just a little bit harder about how it's going to work out. Prevention is better than cure: anything making it less likely that "coding errors" make it into the database is most definitely a Good Thing To Have. Grammars which allow "just about everything" are a pain because they frustrate this error checking and delay error detection considerably, often relying on a user to report an anomaly, triggering all kinds of incident management and problem management processes and costing thousands of times more to fix than if the input validation had stopped the error occurring in the first place.
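
To make the precedence point concrete, here is a minimal Python sketch. The numbers and the boolean names (wet, sunday, hgv) are arbitrary values chosen for illustration only. Python, like most languages, binds `*` tighter than `+` and `and` tighter than `or`, so the unparenthesised forms are legal but say nothing about what the author intended:

```python
a, b, c, d = 2, 3, 4, 5

# What the language does: multiplication binds tighter than addition.
implicit = a * b + c * d
explicit = (a * b) + (c * d)

# What a hasty reader might assume: strict left-to-right evaluation.
left_to_right = ((a * b) + c) * d

# The same trap with boolean operators: `and` binds tighter than `or`.
wet, sunday, hgv = True, False, False
implicit_bool = wet or sunday and hgv      # parsed as: wet or (sunday and hgv)
other_reading = (wet or sunday) and hgv    # a different, equally plausible intent

print(implicit, explicit, left_to_right)   # 26 26 50
print(implicit_bool, other_reading)        # True False
```

With mandatory parentheses only the `explicit` and `other_reading` spellings would be accepted, so a reviewer never has to recall the precedence table.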


"If I were king" I would be looking for a system that:
* makes common cases easy
* makes complex cases possible
* makes each rule as standalone as possible (one sign -> one rule)
* does not rely on the user's fluency in English grammar (knowledge of a set of specific words, e.g. tags and functions, is fine)
* uses grammatical constructions which are familiar to most people, or easily learnt
* has a grammar allowing for a user-extensible function repertoire
* allows user-defined functions to be stored in an external file (accessible at entry and run time)
* fits comfortably with the key-value-pair paradigm of OSM
* can produce a result in a variety of data types including at least boolean, number, string
* can use the value of other tags as variables
* can use other variables supplied at run time (e.g. weather, time, vehicle type)
* supports the usual comparison and numeric operations
* supports string concatenation operation
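
As a rough illustration of a few items on this list (an extensible function repertoire, typed results, tag values and run-time values as variables), here is a small Python sketch. Everything in it is hypothetical: the registry, the function names `is_wet` and `concat`, and the sample tags are invented for the example and are not part of any proposal.

```python
from typing import Callable, Dict, Union

Value = Union[bool, float, str]

# Hypothetical repertoire of functions; in practice it could be loaded from
# an external file so that it is available at entry time and at run time.
FUNCTIONS: Dict[str, Callable[..., Value]] = {}

def register(name: str):
    """Decorator that adds a function to the repertoire under `name`."""
    def wrap(fn: Callable[..., Value]) -> Callable[..., Value]:
        FUNCTIONS[name] = fn
        return fn
    return wrap

@register("is_wet")
def is_wet(weather: str) -> bool:
    return weather == "wet"

@register("concat")
def concat(*parts: str) -> str:
    return "".join(parts)

def call(name: str, *args: Value) -> Value:
    """Look a function up by name; unknown names are rejected immediately."""
    if name not in FUNCTIONS:
        raise KeyError(f"unknown function: {name}")
    return FUNCTIONS[name](*args)

# Variables come from two places: other tags on the object, and the
# context supplied by the application at run time.
tags = {"maxspeed": "100", "ref": "A38"}
runtime = {"weather": "wet", "vehicle_type": "hgv"}

limit = 80.0 if call("is_wet", runtime["weather"]) else float(tags["maxspeed"])
label = call("concat", tags["ref"], " / ", runtime["vehicle_type"])
print(limit, label)  # 80.0 A38 / hgv
```

Rejecting unknown function names at lookup time is one way to get the early input validation argued for above.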

Colin

On 14/06/2012 22:19, martinq wrote:

Hi,
sadly this discussion was restarted before I could establish a reference implementation for a less technical way of tagging conditional values (for those who are interested: it is an ANTLR grammar, hopefully with built-in evaluation code). The reference implementation is for me a key for acceptance, because the less technical it is to tag, the more difficult it is to parse. And we all agree that it should be possible to parse it - but not necessarily easy.
Objective of my proposal: As few rules as possible - as close as possible to the sign-posted information.
The proposal page does not contain a lot of information, because I adapt the "grammar" based on what is feasible. Sadly, I cannot spend a lot of time on continuing with my reference implementation at the moment.
I will comment based on what I have already figured out:

> - "Or" operators. "Maxspeed is 80 if it's wet or Sunday" can be
> rephrased as "Maxspeed is 80 if it's wet. Maxspeed is 80 if it's
> Sunday." IOW, these can be modelled by using two tags instead of one.
This is in fact the biggest challenge in the current state of my parser (in combination with fuzzy & context-related precedence). Human language is sadly not very precise: "except taxis AND bicycles" does not mean you must be in a vehicle that is both (it means if not taxis AND if not bicycles), whereas "except HGV AND weight>7.5t" is interpreted differently by most humans (if not (HGV AND weight>7.5t)). There are a lot of other examples of ambiguities, e.g. "except Fr, 10:00-18:00" does not mean complete Friday and the other days, etc.
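
The two readings of AND can be written out explicitly. The following Python sketch is only an illustration; the dictionary keys (`taxi`, `bicycle`, `hgv`, `weight_t`) are assumed names for vehicle attributes, not agreed tags.

```python
from typing import Dict, Union

Vehicle = Dict[str, Union[bool, float]]

def exempt_as_list(vehicle: Vehicle) -> bool:
    """'except taxis AND bicycles': AND enumerates exempt classes,
    so being either one is enough (logically an OR)."""
    return bool(vehicle.get("taxi", False) or vehicle.get("bicycle", False))

def exempt_as_joint(vehicle: Vehicle) -> bool:
    """'except HGV AND weight>7.5t': AND joins two conditions that must
    both hold for the same vehicle."""
    return bool(vehicle.get("hgv", False) and vehicle.get("weight_t", 0.0) > 7.5)

print(exempt_as_list({"taxi": True}))                    # True
print(exempt_as_joint({"hgv": True, "weight_t": 5.0}))   # False
print(exempt_as_joint({"hgv": True, "weight_t": 12.0}))  # True
```

The same word on the sign therefore maps to two different operators, which is exactly what a parser has to guess from context.
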
However, in this case the preliminary parser has no problem understanding the following different expressions:
maxspeed:cond = 80 if wet or Sunday

Easy tagging, isn't it?
But the grammar is flexible. Instead of 'if' I also support 'when' and '[' ']' (I am not sure about the brackets yet - they are clearly not as intuitive as 'if' or 'when').
maxspeed:cond = 80 [wet]
maxspeed:cond2 = 80 [Su]

Also understood by the parser:
maxspeed:cond = 80 when weekday is Sunday and condition is wet
or
maxspeed:cond = 80 weekday=Su or condition=wet
or
maxspeed:cond = 80 [wet]; 80 on Sundays

and many other variants. It is almost impossible to tag it wrong.
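
One way to picture what all these variants could reduce to is a list of (value, condition) alternatives where any matching alternative yields the value. The Python sketch below assumes such a normalised form; the structure and the context keys `condition` and `weekday` are hypothetical and say nothing about how the ANTLR parser actually represents its result.

```python
from typing import Callable, Dict, List, Optional, Tuple

Context = Dict[str, str]
Rule = Tuple[str, Callable[[Context], bool]]

# Assumed normalised form of "80 [wet]; 80 on Sundays" (equivalently,
# "80 if wet or Sunday" split into two alternatives).
rules: List[Rule] = [
    ("80", lambda ctx: ctx.get("condition") == "wet"),
    ("80", lambda ctx: ctx.get("weekday") == "Su"),
]

def evaluate(rules: List[Rule], ctx: Context) -> Optional[str]:
    """Return the value of the first alternative whose condition holds."""
    for value, predicate in rules:
        if predicate(ctx):
            return value
    return None  # no conditional value applies; fall back to the plain tag

print(evaluate(rules, {"condition": "wet", "weekday": "Mo"}))  # 80
print(evaluate(rules, {"condition": "dry", "weekday": "Mo"}))  # None
```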


> - Brackets for expressions. If we limit ourselves to just "and"
> operators, evaluation order doesn't matter.
This is something I really want to avoid. Brackets for precedence purposes are a purely technical artefact and I have not seen them on the signs with the information we want to tag...
However, the *precedence* is the major problem in the current parser.
Thus I don't think I can write a parser without any rule helping the parser and restricting the mappers. But brackets will only be introduced if I have no other option.
>> Pseudo-Javascript: (!is_motor_vehicle(vehicle_type)) ||
>> ((vehicle_type='hgv')&&  (time<'10:00' || time>'20:00')&&
>> intention='loading')

== side note ==
I assume this is access related.
The current access tags are IMO just abbreviations; nowadays we would write, instead of
hgv=yes --> access:hgv=yes.

With the conditional value proposal it could be tagged as

access=yes if hgv
maxweight=x is an example for (vehicle) access = no if weight>x, even though it can also be a non-conditional property of an object (e.g. a bridge may have an intrinsic maximum acceptable weight, but we don't have to go into details now).
== side note end ==


I interpreted the code in your example as

access=yes
access:cond1 = no if motorized
  or "no if motor vehicle" or "no if vehicle is motorized"
access:cond2 = delivery if hgv from 10:00-20:00
  or "delivery if vehicle is a hgv and time is 10:00-20:00"
  or many variants more...
If the evaluation part of my parser works (I am still working on the grammar), I may also be able to create a kind of "normalized" JavaScript expression out of the "fuzzy" human tags [but I haven't implemented a normalized attributed tree yet].
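
For comparison, the quoted pseudo-JavaScript can be written as runnable Python roughly as follows. This is a direct transcription of the quoted expression, not of the tags above; the set of motor vehicle types is an assumption made up for the example.

```python
from datetime import time

# Illustrative only: not an agreed list of motor vehicle classes.
MOTOR_VEHICLES = {"motorcar", "motorcycle", "hgv"}

def is_motor_vehicle(vehicle_type: str) -> bool:
    return vehicle_type in MOTOR_VEHICLES

def access_allowed(vehicle_type: str, now: time, intention: str) -> bool:
    """Non-motor vehicles always; HGVs only before 10:00 or after 20:00,
    and only when loading."""
    return (not is_motor_vehicle(vehicle_type)) or (
        vehicle_type == "hgv"
        and (now < time(10, 0) or now > time(20, 0))
        and intention == "loading"
    )

print(access_allowed("bicycle", time(12, 0), "transit"))  # True
print(access_allowed("hgv", time(8, 30), "loading"))      # True
print(access_allowed("hgv", time(12, 0), "loading"))      # False
```
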
> - defining a syntax for time intervals (opening_hours)
By using on/off, this is already the first tag which moved the condition into the value. By using off/on, it reads like
off if ...
on if ...
However, the author struggled with the same basic problem, e.g. there is a (non-intuitive) difference between using ',' and ';' now.
Also, except for basic time restrictions, these expressions are impossible to tag and also difficult to read. Clearly powerful, but too compressed for casual mappers. Can you read this?
Dec 25-Su-28 days - Mar Su[-1]: 22:00-23:00
In this case I would stick to human-readable expressions like "last Sunday in March" and put the load of evaluating it onto the parser/application.
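
Resolving a phrase like "last Sunday in March" is cheap for an application once the phrase has been recognised. A minimal Python sketch using only the standard library (the function name is made up for the example):

```python
import calendar
from datetime import date, timedelta

def last_weekday_of_month(year: int, month: int, weekday: int) -> date:
    """Return the last occurrence of `weekday` (calendar.MONDAY .. SUNDAY)
    in the given month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # Step back from the last day of the month to the wanted weekday.
    offset = (last_day.weekday() - weekday) % 7
    return last_day - timedelta(days=offset)

print(last_weekday_of_month(2012, 3, calendar.SUNDAY))  # 2012-03-25
```
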
> - defining a subset hierarchy of vehicle categories (such as
> "motor_vehicle" including "hgv" as a subset)
Applications must know which vehicle you drive to evaluate certain conditional values (mainly access, speed limits and also parking conditions). Unlike the current access tags we don't need a hierarchy.
motorized is an attribute of the vehicle; the application must know about it. Not every taxi is motorized [rickshaws or bicycle taxis]; the attributes for the circumstances of use should be interdependently determined by the application. The application may use a tree internally, but I don't think it's the mapper's job.
For a standard taxi an application may work with the following attributes:
taxi = yes
weight = 2t
motorized = yes
width = 2m
length = 5m
height = ...
permits = A38, A39
public service = yes
[however, if the taxi is empty, this attribute may be no]
passengers = 2
sex = male
age = 55
engine power = 22PS
vehicle maxspeed = 180
colour = black
etc.
[I hope the concept is clear]
Independently of that, the application also requires knowledge about date & time for certain conditions - and the application may need to know the weather, holidays, purpose of travel (or intention), destination of travel (especially in my area several access restrictions depend on the area you are driving to).
The mappers no longer have to worry about interdependencies between such attributes, e.g. is a rickshaw a motor vehicle or a bicycle or just a pedestrian with a trailer? (It can be all.)
Now, whenever some properties are not known, the application simply cannot evaluate all conditional values. Either the application can accept this and continue with the safety value (e.g. no for access, lowest speed for maxspeed, etc.) or it may have to ask the user (e.g. the user can select whether the bicycle map or the motor vehicle map should be shown).
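
The fallback to a safety value can be sketched with a three-valued condition: true, false, or unknown. The Python below is only an illustration under assumed names (`attr_equals`, `conditional_value`) and an assumed rule, "access = no if motorized"; it is not taken from the reference implementation.

```python
from typing import Callable, Dict, Optional

Vehicle = Dict[str, object]
Condition = Callable[[Vehicle], Optional[bool]]

def attr_equals(name: str, expected: object) -> Condition:
    """Build a condition that yields None when the attribute is unknown."""
    def condition(vehicle: Vehicle) -> Optional[bool]:
        if name not in vehicle:
            return None
        return vehicle[name] == expected
    return condition

def conditional_value(vehicle: Vehicle, condition: Condition,
                      if_true: str, if_false: str, safety: str) -> str:
    """Evaluate one conditional tag, using the safety value when the
    application lacks the attribute needed to decide."""
    result = condition(vehicle)
    if result is None:
        return safety
    return if_true if result else if_false

# Assumed rule: access=yes, access:cond = no if motorized; safety value "no".
motorized = attr_equals("motorized", True)

print(conditional_value({"taxi": True, "motorized": True}, motorized, "no", "yes", "no"))   # no
print(conditional_value({"taxi": True, "motorized": False}, motorized, "no", "yes", "no"))  # yes
print(conditional_value({"taxi": True}, motorized, "no", "yes", "no"))                      # no
```

An application that prefers to ask the user could treat the `None` result as the trigger for that question instead of substituting the safety value.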
> So how do the existing proposals fit in here? Well, the primary
> difference between the example above and "extended conditions" is that
> the latter aims for short, manually editable strings by letting
> literals for vehicle classes, weather conditions etc., as well as time
> intervals, stand for themselves - based on the assumption that a parser
> will be able to unambiguously identify them.

1) I dislike proposals which try to solve only the situation for access
access is a tag like any other - and we shouldn't re-invent the wheel:
access:cond = yes if time is 10:00-18:00  (or simply yes [10:00-18:00])
maxspeed:cond = 80 if time is 10:00-18:00
parking:lane:left:cond = parallel time is 10:00-18:00
open:cond = yes time is 10:00-18:00

or - simply to emphasize the concept -
access:cond = yes if female
maxspeed:cond = 80 if female
parking:lane:left:cond = parallel if female
open:cond = yes if female
The extended access proposal can be used for any tags, thus no issue here:
access:hgv:(20:00-10:00) = ...
maxspeed:hgv:(20:00-10:00) = ...
parking:lane:left:hgv:(20:00-10:00) = ...
...
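
A practical detail of the key-side form is that the time condition itself contains colons, and that a range such as 20:00-10:00 wraps past midnight. A minimal Python sketch of both points, with hypothetical helper names:

```python
import re
from datetime import time
from typing import List

def split_key(key: str) -> List[str]:
    """Split a key on ':' while keeping parenthesised parts such as
    '(20:00-10:00)' in one piece."""
    return re.findall(r"\([^)]*\)|[^:()]+", key)

def in_time_range(spec: str, now: time) -> bool:
    """Check 'HH:MM-HH:MM' (optionally in parentheses); a start later
    than the end is treated as wrapping past midnight."""
    start_s, end_s = spec.strip("()").split("-")
    start = time(*map(int, start_s.split(":")))
    end = time(*map(int, end_s.split(":")))
    if start <= end:
        return start <= now <= end
    return now >= start or now <= end

print(split_key("parking:lane:left:hgv:(20:00-10:00)"))
# ['parking', 'lane', 'left', 'hgv', '(20:00-10:00)']
print(in_time_range("(20:00-10:00)", time(23, 30)))  # True
print(in_time_range("(20:00-10:00)", time(12, 0)))   # False
```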


2) Value vs. key
I think value-side conditions would be more intuitive, because the value depends on the condition.
Also, it is easier to filter the things in the database, especially if left/right & forward/backward is also mixed into the conditions (or should we simply go the next step and see them as conditions too?).
The disadvantage is that values can contain any characters. This makes it hard to identify the start of the condition in a parser.
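
A naive splitter shows both how simple the common case is and where the difficulty comes from. This Python sketch is an assumption about how such a split could be done (first ' if ' / ' when ' keyword, else trailing brackets); it is not the behaviour of the ANTLR grammar.

```python
import re
from typing import Optional, Tuple

def split_conditional(raw: str) -> Tuple[str, Optional[str]]:
    """Split 'value if condition', 'value when condition' or
    'value [condition]' into (value, condition)."""
    m = re.match(r"^(.*?)\s+(?:if|when)\s+(.*)$", raw)
    if m:
        return m.group(1), m.group(2)
    m = re.match(r"^(.*?)\s*\[([^\]]*)\]\s*$", raw)
    if m:
        return m.group(1), m.group(2)
    return raw, None

print(split_conditional("80 if wet or Sunday"))  # ('80', 'wet or Sunday')
print(split_conditional("80 [wet]"))             # ('80', 'wet')
print(split_conditional("yes"))                  # ('yes', None)
# A free-text value that itself contains ' if ' is cut in the wrong place:
print(split_conditional("ring bell if closed"))  # ('ring bell', 'closed')
```

The last call is the failure mode described above: without knowing which values are legal for a given key, the parser cannot tell a condition keyword from ordinary text.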
3) The extended access proposal:

> motor_vehicle = no
> hgv:(20:00-10:00):loading = yes

Normal form is nice to parse - but do you think everybody can map it?
Non-normal form is not so nice for machines - thus I cannot promise that I will manage to parse it - and the discussion is theoretical until I can prove it (with a reference implementation).
I also see no reason why an application may not be able to treat these as equivalent:
hgv = yes    (shortcut for:)
access:hgv = yes (which is a valid expression also in the proposed extension)
access:cond = yes [hgv]
Then backward compatibility, extended condition tags or value-side conditions could be used. If applications need a parser anyway...
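
Treating the three spellings as equivalent amounts to mapping each onto one internal triple. The Python sketch below assumes a (key, value, condition) triple and a small, made-up set of vehicle keys; a real application would need the full vocabulary and a proper condition parser.

```python
from typing import Optional, Tuple

# Illustrative subset only.
VEHICLE_KEYS = {"hgv", "taxi", "bicycle"}

def normalise(key: str, value: str) -> Tuple[str, str, Optional[str]]:
    """Map shortcut, key-side and value-side forms to (key, value, condition)."""
    parts = key.split(":")
    # hgv = yes
    if len(parts) == 1 and parts[0] in VEHICLE_KEYS:
        return "access", value, parts[0]
    # access:hgv = yes
    if len(parts) == 2 and parts[0] == "access" and parts[1] in VEHICLE_KEYS:
        return "access", value, parts[1]
    # access:cond = yes [hgv]
    if len(parts) == 2 and parts[1] == "cond" and "[" in value and value.endswith("]"):
        plain, condition = value[:-1].split("[", 1)
        return parts[0], plain.strip(), condition.strip()
    return key, value, None

print(normalise("hgv", "yes"))                # ('access', 'yes', 'hgv')
print(normalise("access:hgv", "yes"))         # ('access', 'yes', 'hgv')
print(normalise("access:cond", "yes [hgv]"))  # ('access', 'yes', 'hgv')
```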
martinq

_______________________________________________
Tagging mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/tagging



