Hi, Victor,

> Can I disable those error messages?

Certainly, you can set the log level for Cocoon here:

https://github.com/DSpace/DSpace/blob/dspace-5_x/dspace/config/log4j.properties#L83-L84

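If you don't need the INFO messages, you can set this to WARN. Note that WARN
still lets ERROR messages such as "Failed to process pipeline" through; to
suppress those as well, use FATAL or OFF instead. [1]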
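
As a rough sketch of what such a change might look like, assuming the linked
file uses log4j 1.x properties syntax (log4j.logger.<name>=<LEVEL>): the logger
names below are taken from the messages quoted in this thread, not copied from
the linked lines, so check them against your own log4j.properties first.

```properties
# Assumption: logger names inferred from the log output in this thread.
# Quiet the INFO chatter (e.g. the CacheImpl "Cache MISS" lines):
log4j.logger.org.apache.cocoon=WARN

# Silence the "ERROR cocoon.handled - Failed to process pipeline" traces.
# WARN would not be enough here, since ERROR ranks above WARN:
log4j.logger.cocoon.handled=OFF
```

Setting only the level, with no appender name after it, leaves whatever
appender the existing configuration already attaches to these loggers.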

I actually just delete these log files via a cron job after a short window on 
most of our boxes... as the info is potentially useful in the short term, but 
useless after a day or two.
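
For illustration only, here is a minimal Python sketch of that kind of cleanup,
something a cron job could run. The "cocoon.log*" file pattern, the two-day
retention and the function name are all assumptions, not what actually runs on
our boxes.

```python
import time
from pathlib import Path


def prune_old_logs(log_dir, pattern="cocoon.log*", max_age_days=2, now=None):
    """Delete files in log_dir matching pattern that are older than max_age_days.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    removed = []
    for path in Path(log_dir).glob(pattern):
        # Compare the last-modified time against the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Calling prune_old_logs() with your DSpace log directory (wherever that lives on
your system) once a day would keep only the last two days of Cocoon logs.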

--Hardy

[1] https://logging.apache.org/log4j/2.x/manual/customloglevels.html

________________________________
From: Victor [[email protected]]
Sent: Tuesday, February 16, 2016 3:18 PM
To: Pottinger, Hardy J.
Subject: Re: [dspace-tech] Understanding / Debugging "ERROR cocoon.handled - 
Failed to process pipeline"

Hi Hardy; I posted another update and here is a summary / a few more thoughts.

My latest suspicion is that the majority of the errors I am finding pertain to
requests for a page that doesn't exist, rather than to problems in the
pageNotFound algorithm or the XML breaking down.

Can I disable those error messages? The cocoon files can end up getting huge 
since our site is being actively crawled and sitemaps only end up being 
suggestions.

Typing in a bogus URL generates a huge message (1500 lines) about why the page 
couldn't be shown. E.g., a handle that doesn't exist.

This doesn't seem like it would be a new problem: sites get crawled and these
massive files get populated as a result. There were 36 instances of "Failed to
process pipeline" between midnight and 1 am today, which, if that pattern
continues, would be roughly 1.3 million lines a day (36 x 24 x ~1500) spent
logging page-not-found errors.
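
The estimate can be checked with a quick calculation. This is only a sketch: it
assumes every failure logs about 1500 lines and that the midnight-hour rate of
36 failures holds all day, and the names are illustrative.

```python
FAILURES_PER_HOUR = 36    # observed between midnight and 1 am
LINES_PER_FAILURE = 1500  # approximate size of one logged failure
HOURS_PER_DAY = 24


def lines_per_day(failures_per_hour, lines_per_failure, hours=HOURS_PER_DAY):
    """Projected log lines per day at a constant hourly failure rate."""
    return failures_per_hour * lines_per_failure * hours


print(lines_per_day(FAILURES_PER_HOUR, LINES_PER_FAILURE))  # 1296000
```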

Did you have any other concerns about protodocument.xml since I found it in the 
official dspace files?

When I enter fake URLs, I can find them fairly easily, but the ones that
happen when presumably only crawlers are active look a little different in the
INFO remarks before the stack trace.

Some examples are:
(these often show #s 9->1 just past DRI)
2016-02-16 00:39:00,754 INFO  org.apache.cocoon.caching.impl.CacheImpl  - Cache MISS for PK_G-aspect-cocoon://DRI/9/wordpress/wp-admin/?pipelinehash=477348137205522819_T-Navigation--5395830111285582982
2016-02-16 00:44:51,853 INFO  org.apache.cocoon.caching.impl.CacheImpl  - Cache MISS for PK_G-aspect-cocoon://DRI/9/blog/wp-admin/?pipelinehash=477348137205522819_T-Navigation-2792154001163817932
2016-02-16 00:55:34,972 INFO  org.apache.cocoon.caching.impl.CacheImpl  - Cache MISS for PK_G-aspect-cocoon://DRI/9/favicon.ico?pipelinehash=477348137205522819_T-Navigation-4797627539972478138
2016-02-16 00:56:21,520 INFO  org.apache.cocoon.caching.impl.CacheImpl  - Cache MISS for PK_G-aspect-cocoon://DRI/9/old/wp-admin/?pipelinehash=9065269893901859459

Oddly enough, it is predominantly favicon.ico that shows up just before the
error. But any ideas you may have on preventing this error would be much
appreciated. Thanks!
  --Victor

