What about configuring robots.txt?
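
If robots.txt is not enough and the user-agent check proposed in the quoted message below is still wanted, the matching itself could look roughly like this. It is only a sketch: BotUserAgents is a hypothetical helper (not part of Tapestry), and the marker substrings are assumptions that would need tuning against real access logs.

```java
/** Hypothetical helper, not part of Tapestry: guesses whether a User-Agent is a crawler. */
final class BotUserAgents {

    // Assumed marker substrings; extend with whatever shows up in your access logs.
    private static final String[] MARKERS = {"bot", "crawler", "spider", "slurp"};

    private BotUserAgents() {
    }

    /** Returns true when the User-Agent header contains one of the marker substrings. */
    static boolean isBot(String userAgent) {
        if (userAgent == null || userAgent.isEmpty()) {
            // No header: treat as a regular client so errors are still logged.
            return false;
        }
        String normalized = userAgent.toLowerCase(java.util.Locale.ROOT);
        for (String marker : MARKERS) {
            if (normalized.contains(marker)) {
                return true;
            }
        }
        return false;
    }
}
```

Inside the custom RequestExceptionHandler, assuming Tapestry's Request and Response are injected there, the idea would be to call BotUserAgents.isBot(request.getHeader("User-Agent")) first and, when it returns true, answer with response.sendError(404, ...) instead of logging and rendering the exception page.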

On Sat, Apr 6, 2013 at 1:50 PM, Nicolas Bouillon <nico...@bouillon.net> wrote:

> Thanks for your thoughts. They lead me to the following suggestion to
> resolve my problem:
>
> I've already contributed a custom RequestExceptionHandler using
> binder.bind(RequestExceptionHandler.class,
> CustomRequestExceptionHandler.class)
>               .withId("CustomRequestExceptionHandler");
>
> This is the service that is responsible for my error log message
> (org.apache.tapestry5.internal.services.DefaultRequestExceptionHandler#handleRequestException).
>
> In my custom RequestExceptionHandler, I could match the user agent
> against known bots and, instead of rendering the exception page, give them
> a 404 response code so they stop crawling the page (while also skipping
> the logger.error call at the top of the handleRequestException method).
>
> Best regards.
>
>
>
> 2013/4/6 Muhammad Gelbana <m.gelb...@gmail.com>
>
> > When you create a Tapestry project using Maven's archetype, it creates the
> > request filter below (slightly modified) to log how much time each request
> > consumed.
> >
> > // Service building
> > public RequestFilter buildTimingFilter(final Logger log) {
> >     return new RequestFilter() {
> >         @Override
> >         public boolean service(Request request, Response response,
> >                 RequestHandler handler) throws IOException {
> >             long startTime = System.currentTimeMillis();
> >             try {
> >                 // The responsibility of a filter is to invoke the
> >                 // corresponding method in the handler. When you chain
> >                 // multiple filters together, each filter receives a
> >                 // handler that is a bridge to the next filter.
> >                 return handler.service(request, response);
> >             } finally {
> >                 long elapsed = System.currentTimeMillis() - startTime;
> >                 // Only log requests that took 10 seconds or longer.
> >                 if (TimeUnit.MILLISECONDS.toSeconds(elapsed) >= 10) {
> >                     log.warn(String.format("Request time: %d ms", elapsed));
> >                 }
> >             }
> >         }
> >     };
> > }
> >
> > // Service contribution
> > public void contributeRequestHandler(
> >         OrderedConfiguration<RequestFilter> configuration,
> >         @Local RequestFilter filter) {
> >     // Each contribution to an ordered configuration has a name. When
> >     // necessary, you may set constraints to precisely control the
> >     // invocation order of the contributed filter within the pipeline.
> >     configuration.add("Timing", filter);
> > }
> >
> > From there, if you find a pattern for requests coming from bots, you can
> > drop these requests if that suits you. This way the bots will also learn
> > that the links they are requesting don't exist anymore and will
> > eventually stop bothering you, if they are smart enough!
> >
> > Regards
> >
> >
> > On Fri, Apr 5, 2013 at 10:08 PM, Nicolas Bouillon <nico...@bouillon.net>
> > wrote:
> >
> > > Dear all,
> > >
> > > I've been working on an e-commerce website in Tapestry 5 for a couple of
> > > years, and the website evolves constantly as we add features and
> > > change things here and there.
> > >
> > > I monitor the application logs closely, but I'm very annoyed by robots
> > > that remember URLs that are no longer valid.
> > >
> > > For example, they remember (or follow old links to) web pages that used
> > > to be coded in PHP, whose URLs contained special characters not allowed
> > > in Tapestry URLs. Or those spiders try to access the URL of a grid pager
> > > event, but the Grid component is not there anymore (or has a different
> > > name). Each of those hits generates a log.error message, and that buries
> > > the important error messages in a lot of noise. (I know, "Cool URIs don't
> > > change", but for page events it can be quite hard to keep old URLs...)
> > >
> > > The error log looks something like this:
> > > org.apache.tapestry5.ioc.util.UnknownValueException: Component
> > > product/domain/PriceList does not contain embedded component 'v3grid'.
> > >
> > > It seems too broad to completely ignore "UnknownValueException" in the
> > > log appender, because that might hide real mistakes in the web application.
> > >
> > > Is there a way to avoid this kind of behavior? How do you handle error
> > > logs from your applications?
> > >
> > > Thanks.
> > >
> > > Nicolas.
> > >
> >
>



-- 
BR
Ivan
