Of course. Here is how it would look:

https://gist.github.com/rezan/7215f5b8d85db1eed0b0


So that configuration goes into a patch file. The patch files are overlaid 
onto the standard DeviceMap index. So the above XML would go into: 
BuilderDataSourcePatch.xml

I just did a quick test on the Java client and it works perfectly :) The one 
catch is that you need to bundle the patch file with the DDR. This makes 
loading from a JAR or URL a bit harder, since you cannot easily insert a patch 
file into those sources. So you need to download the DDR and load it from a 
filesystem folder. I'm looking to make this easier in future versions.


 

     From: Volkan YAZICI <[email protected]>
 To: [email protected]; Reza Naghibi <[email protected]> 
 Sent: Tuesday, December 9, 2014 11:59 AM
 Subject: Re: Handling Bots and HTTP Clients
   
The model I proposed will not buy us a significant performance gain, which was 
also not my major motivation. (That being said, I also second the idea of 
implementing a benchmark.) Instead, I wanted to address the issue of separating 
the concerns of handling bots and regular devices.
Maybe I should rephrase my starting point: How can we add new bot and 
HTTP client footprints to the existing DDR?



On Tue Dec 09 2014 at 2:31:24 PM Reza Naghibi <[email protected]> 
wrote:

So let me explain some of the issues with this. Regardless, I would still like 
you to benchmark said patch and share the results. This will help drive the 
direction of future work on the clients.

1) I'm almost certain isBot(ua) will perform worse than classify(ua), defeating 
the whole purpose of short-circuiting classify. How do you plan on implementing 
isBot()? If that algorithm performs better than classify(), we might as well 
use it to match the entire DDR. No?

2) Under no circumstances should we implement DDR logic in code. The code 
should remain as generic as possible. This means that it's just a plain old 
ngram matcher. This kind of logic belongs in the DDR definition. Right now this 
allows for patterns and ranking. So maybe what you are asking is that high-ranking 
patterns be checked first in a very quick way? Well, why are bots so high 
ranking? In normal traffic, bots make up a very small percentage. So wouldn't it 
make more sense to check for Samsung and Apple products first?

Once again, if possible, please benchmark some before-and-after runs so we can 
get a better idea of what we are working with here. Even though I'm leaning 
towards saying this is a bad idea, I think it is a good exercise.


      From: Volkan YAZICI <[email protected]>
 To: "[email protected]" <[email protected]>
 Sent: Tuesday, December 9, 2014 7:34 AM
 Subject: Handling Bots and HTTP Clients

Hello,

In the context of the discussion "how do we handle HTTP clients", I would like
to vote for treating them as bots. Further, I want to propose adding a thin
layer above DeviceMapClient.classify() to short-circuit the handling of
bots as follows.

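private static final Map<String, String> botAttributes =
        Collections.singletonMap("is_bot", "true");

public Map<String, String> classify(String userAgent) {
    // Short-circuit: skip the DDR match entirely for known bots.
    if (isBot(userAgent)) return botAttributes;
    // Otherwise fall through to the regular DeviceMapClient.classify() match
    // ("client" is assumed to be the wrapped DeviceMapClient instance).
    return client.classify(userAgent);
}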

The motivation for this change is as follows:

  - Almost none of the attributes make sense for a bot, and we are
  losing time matching it against the whole DDR.
  - Bot database will be able to evolve independently.
  - We can come up with a single compiled j.u.regex.Pattern to check bots.
  (I am pretty sure Reza knows a lot better performing approaches, but maybe
  for a future release.)
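
As a rough illustration of the single-pattern idea in the last bullet, here is
a minimal sketch. The class name BotShortCircuit, the footprint list, and the
injected delegate function are all assumptions made for the example; the
delegate only stands in for DeviceMapClient.classify() so that the sketch runs
without the client on the classpath.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical wrapper for illustration; not part of the DeviceMap client.
final class BotShortCircuit {

    // Illustrative footprints only. A real list would be maintained
    // separately and is not taken from the DDR.
    private static final Pattern BOT_PATTERN = Pattern.compile(
            "googlebot|bingbot|curl/|wget/|python-requests",
            Pattern.CASE_INSENSITIVE);

    private static final Map<String, String> BOT_ATTRIBUTES =
            Collections.singletonMap("is_bot", "true");

    // Stand-in for DeviceMapClient.classify(), injected by the caller.
    private final Function<String, Map<String, String>> delegate;

    BotShortCircuit(Function<String, Map<String, String>> delegate) {
        this.delegate = delegate;
    }

    // True if any known footprint occurs anywhere in the user agent.
    static boolean isBot(String userAgent) {
        return userAgent != null && BOT_PATTERN.matcher(userAgent).find();
    }

    // Bots get the fixed attribute map; everything else goes to the delegate.
    Map<String, String> classify(String userAgent) {
        if (isBot(userAgent)) {
            return BOT_ATTRIBUTES;
        }
        return delegate.apply(userAgent);
    }
}
```

Whether one alternation like this actually beats the existing matcher is
exactly what a benchmark would have to show; the sketch only makes the
proposed control flow concrete.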

If the development team is ok with that, I want to implement this feature.

Best.


  


  
