Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Devicemap Wiki" for 
change notification.

The "Patterns2" page has been changed by rezan:
https://wiki.apache.org/devicemap/Patterns2?action=diff&rev1=21&rev2=22

  
  Empty tokens are removed from the tokenization step.
  
- When a token is created and added to the token stream, it can be processed by the
+ When a token is added to the token stream, it can be processed by the
  pattern matching step before moving on to the next token. This algorithm is pipeline
  and thread safe.
  
+ If the Ngram``Concat``Size is greater than 1, ngrams must be added to the token stream ordered largest to smallest.
- If the Ngram``Concat``Size is greater than 1, the largest ngram must be
- made first before creating the smaller ngrams.
  
  
  === Example ===
@@ -124, +123 @@

  
  = Pattern Matching =
  
- This step processes the token stream and returns the highest ranking pattern which
+ This step processes the token stream and returns the highest ranking candidate pattern.
- matches on the stream (highest ranking candidate).
  
  The pattern file defines a pattern set. Each pattern has 2 main attributes,
  its pattern type and its pattern rank. The pattern
  type defines how the pattern is supposed to be matched against the token stream.
  The pattern rank defines how the pattern ranks against other patterns.
  
+ All patterns in the pattern set are evaluated to find the pattern candidates.
- If the pattern type is successfully matched against the stream, it is now a candidate
- for being returned. Candidates are ranked against each other using the pattern ranking
- and the highest ranking pattern is returned.
  
  All the pattern types in 2.0 are prefixed with 'Simple'. This means that each pattern token is matched
  using a plain byte string comparison. No regex or other syntax is allowed in Simple patterns.
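
For readers of this diff, the revised Pattern Matching text (evaluate every pattern, collect the candidates, return the highest ranking one) could be sketched roughly as below. This is only an illustration under assumptions, not the DeviceMap implementation: the Pattern class, its field names, the "every pattern token must appear in the stream" matching rule, and the idea that a numerically larger rank wins are all hypothetical. The only parts taken from the page are the plain byte string comparison and the candidate ranking.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class Pattern:
    # All names here are hypothetical; the wiki page does not define them.
    pattern_id: str
    rank: int                       # assumption: larger value = higher rank
    pattern_tokens: Sequence[bytes]

    def matches(self, stream: Sequence[bytes]) -> bool:
        # Hypothetical matching rule standing in for a real pattern type:
        # every pattern token must appear in the token stream.
        # Tokens are compared as plain byte strings (no regex, no case folding).
        stream_tokens = list(stream)
        return all(token in stream_tokens for token in self.pattern_tokens)


def best_pattern(patterns: Iterable[Pattern],
                 stream: Sequence[bytes]) -> Optional[Pattern]:
    # Evaluate all patterns in the set to find the candidates.
    candidates = [p for p in patterns if p.matches(stream)]
    # Return the highest ranking candidate, or None if nothing matched.
    return max(candidates, key=lambda p: p.rank, default=None)
```

With a stream of [b"iphone", b"os", b"7"], a rank 5 pattern on (b"iphone", b"7") would be returned ahead of a rank 1 pattern on (b"iphone",), while a pattern on (b"android",) never becomes a candidate regardless of its rank.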

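Going back to the tokenization change in the first hunk, the "largest to smallest" ordering of ngrams might look something like the following sketch. It assumes that ngrams are built by concatenating consecutive tokens starting at each token position, which this excerpt does not spell out; the function and parameter names are made up for illustration.

```python
from typing import Iterable, Iterator


def ngram_stream(tokens: Iterable[str], ngram_concat_size: int) -> Iterator[str]:
    # Empty tokens are removed during tokenization.
    kept = [token for token in tokens if token]
    for start in range(len(kept)):
        # Assumption: ngrams start at each token position and are capped
        # by the number of tokens remaining in the input.
        largest = min(ngram_concat_size, len(kept) - start)
        # Emit the largest ngram first, then the smaller ones.
        for size in range(largest, 0, -1):
            yield "".join(kept[start:start + size])
```

Written as a generator, each ngram can be handed to the pattern matching step as soon as it is produced, which fits the pipeline behaviour described in the changed paragraph.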