On Thu, Oct 2, 2014 at 8:22 AM, Andy Bennett <andy...@ashurst.eu.org> wrote:

> Hi,
>
> I am trying to use the browscap.org database to do HTTP User Agent
> Classification.
>
> This database consists of a (large) number of regexes and data about the
> browser should the user agent string match that regex.
>
> What I want to do is compile all the regexes together and be able to add
> annotations such that I can match a UA string against this regex and get
> back an idea of which pattern matched so that I can look up the
> appropriate data.
>
> i.e. I have a data structure keyed by "pattern" and I want my input
> to be something that matches that pattern rather than the pattern itself.
>
> It seems that for this I need "Callbacks" but I don't really need full
> callback support: I don't necessarily need to call an actual procedure
> and I don't need to replace anything: I'm not doing a search/replace,
> just a match. "All" I really need is to be able to annotate the FSM node
> that matched with a little bit of data that I can get back.
>


You could use submatch info and check which submatch matched.
This would keep the matching as a single regexp, but you'd then
need a linear scan to see which submatch succeeded.

(define (irregex-merge-vector vec)
  (irregex `(or ,@(map (lambda (x) `(=> alt ,x)) (vector->list vec)))))

(define ua-vec ...)
(define all-ua-rx (irregex-merge-vector ua-vec))

(define (maybe-match-ua ua)
  (cond
    ((irregex-match all-ua-rx ua)
     => (lambda (m)
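             ;; Submatch 0 is the whole match, so alternative i is submatch
             ;; i+1 (assuming the patterns contain no submatches themselves).
             (vector-ref ua-vec
                         (- (irregex-match-numeric-index 'match-ua m '(alt))
                            1))))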
    (else
      #f)))

although I believe irregex-match-numeric-index is not exported.
It's worth having a utility for this idiom.
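
Until such a utility exists, here is a minimal sketch of the same idea that
sticks to procedures I am confident are exported (irregex-match-substring and
irregex-match-num-submatches). It uses plain numbered submatches instead of
the named "alt" ones and does the linear scan by hand. The sample patterns
and the helper name matched-alternative are made up for illustration, and the
import line assumes CHICKEN 5 (on CHICKEN 4 it would be (use irregex)).

```scheme
(import (chicken irregex))

;; Hypothetical sample data: SRE patterns with no submatches of their own,
;; so that submatch i+1 corresponds to element i of the vector.
(define ua-vec
  (vector '(: "Mozilla/" (+ any) "Firefox" (* any))
          '(: "Mozilla/" (+ any) "Chrome" (* any))
          '(: "curl/" (* any))))

;; Wrap each alternative in its own numbered submatch.
(define (irregex-merge-vector vec)
  (irregex `(or ,@(map (lambda (x) `(submatch ,x)) (vector->list vec)))))

(define all-ua-rx (irregex-merge-vector ua-vec))

;; Linear scan over submatches 1..n. Returns the 0-based index of the
;; first alternative that took part in the match, or #f if none did.
(define (matched-alternative m)
  (let ((n (irregex-match-num-submatches m)))
    (let loop ((i 1))
      (cond ((> i n) #f)
            ((irregex-match-substring m i) (- i 1))
            (else (loop (+ i 1)))))))

;; Returns an index into ua-vec, or #f when no pattern matches.
(define (maybe-match-ua ua)
  (cond
    ((irregex-match all-ua-rx ua) => matched-alternative)
    (else #f)))
```

The result of maybe-match-ua is an index into ua-vec, so it can equally
index a parallel vector holding the browscap data for each pattern. If the
real patterns contain submatches of their own, the index arithmetic no longer
holds and you would need a table mapping submatch numbers back to patterns.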

-- 
Alex


>
> Is this something that would be easy to add to irregex or can anyone
> suggest any other alternative implementations that I might consider?
>
>
> The PHP library that uses this browscap database (apparently) just does
> a linear search by trying to match each regex in turn but I'd rather
> keep that approach as a last resort.
>
>
>
> Thanks for your help and any tips you can offer.
>
>
>
> Regards,
> @ndy
>
> --
> andy...@ashurst.eu.org
> http://www.ashurst.eu.org/
> 0x7EBA75FF
>
>
_______________________________________________
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users