Also, here are two examples of pure pattern matching algorithms that work 
extremely well (I wrote both of them):

-dClass [0]. This project uses pure pattern matching to do device 
classification. It also does OS detection and browser detection using a 
separate OS and browser index.

-stats.zone [1]. This is a side project I wrote which parses human language in 
the baseball domain. It uses pure pattern matching to accomplish language 
classification (NLP). This should demonstrate the power of pattern matching on 
a much more complicated domain, the English language.

I strongly feel that we need to embrace a fully parallel pattern matching 
algorithm as the path forward in this project. This is why I would like to 
distance this project from the legacy serial user agent parsing algorithm. 
Parsing user agents has failed time and time again due to complexities and an 
ever-changing device landscape. It's also slow, complicated, and very error 
prone. Pattern matching is extremely simple, extremely fast, and purely data 
driven. I don't expect the core algorithm to change much over the course of 
major releases, only the data powering the algorithm.
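
To make the serial versus parallel contrast concrete, here is a minimal Python 
sketch of index-based matching. It is not the dClass or DeviceMap 
implementation: the pattern table, the tokenizer, and the names build_index 
and classify are all hypothetical and chosen only for illustration. The idea it 
shows is that the user agent is tokenized once and each token is looked up in a 
hash index, so the work depends on the length of the string rather than on the 
number of patterns, and adding a device means adding data, not code.

```python
import re

# Hypothetical pattern data: device id -> tokens that must all be present.
# In a real deployment this would be loaded from the device data files.
PATTERNS = {
    "iphone": ["iphone"],
    "ipad": ["ipad"],
    "galaxy_s5": ["android", "sm-g900f"],
    "generic_android": ["android"],
}


def tokenize(user_agent):
    """Lowercase the string and return its runs of letters, digits, and hyphens."""
    return re.findall(r"[a-z0-9-]+", user_agent.lower())


def build_index(patterns):
    """Invert the pattern table into token -> set of device ids using that token."""
    index = {}
    for device_id, tokens in patterns.items():
        for token in tokens:
            index.setdefault(token, set()).add(device_id)
    return index


def classify(user_agent, patterns, index):
    """Return the device id whose tokens all appear, preferring the most specific."""
    hits = {}
    # One pass over the distinct tokens; each lookup touches only the
    # patterns that mention that token, never the whole pattern table.
    for token in set(tokenize(user_agent)):
        for device_id in index.get(token, ()):
            hits[device_id] = hits.get(device_id, 0) + 1
    # A pattern matches only if every one of its tokens was seen.
    complete = [d for d, n in hits.items() if n == len(set(patterns[d]))]
    if not complete:
        return None
    # Assumption: more tokens means more specific; ties are broken by id.
    return max(complete, key=lambda d: (len(set(patterns[d])), d))


INDEX = build_index(PATTERNS)
```

For example, classify("Mozilla/5.0 (Linux; Android 4.4.2; SM-G900F Build/KOT49H)", 
PATTERNS, INDEX) selects "galaxy_s5" over "generic_android" because both 
patterns match but the first one has more tokens. The ranking rule is an 
assumption of this sketch, not something taken from either project.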

This project is in a very treacherous landscape since device classification is 
not widely accepted. Therefore, we need to be as state of the art as possible. 
We also need to be flexible and fast.

[0] https://github.com/TheWeatherChannel/dClass
[1] http://stats.zone/

---
From: Reza Naghibi <[email protected]>
To: Devicemap-dev <[email protected]>
Sent: Friday, January 2, 2015 12:58 PM
Subject: Deleting the legacy ODDR client and related artifacts from SVN
   
Any objections to deleting the legacy ODDR Java client and its related 
artifacts from SVN? This is purely a code cleanup. Here are my thoughts on this 
matter:

-The legacy client was rewritten a year ago, and the rewrite offers a huge set 
of improvements. It's simpler, several orders of magnitude faster, more 
predictable, and it moves all of the device logic from code to data. Basically, 
it's modern. One of the biggest changes is that the legacy ODDR client loops 
through every pattern looking for a match, one by one, using a complicated set 
of heuristics specific to each class of user agents. This does not scale. The 
new client is able to check all patterns in parallel using pure pattern 
matching. This scales extremely well.

-The DDR data can no longer evolve to support the legacy client. While the 
1.0.x releases may work, once 2.0 is released, the legacy client will in no 
way, shape, or form still work.

-The legacy client is a distraction. It's taking focus away from moving our 
current objectives forward. This project, like all projects, must evolve. This 
means rewriting clients, reformatting data, and basically throwing old things 
away. This is a natural process in any software development project. The same 
considerations must be given to old artifacts in this project. This project 
must evolve.

If there are no objections, I will be removing the legacy artifacts from SVN in 
5 days (120 hours).


  
