Hello Joachim,

Joachim Jacob | VIB | wrote, On 03/26/2013 10:01 AM:
> 
> abrt was filling the root directory indeed. So disabled it.
> 
> I have done some exporting tests, and the behaviour is not consistent.
> 
> 1. *size*: in general, it worked out for smaller datasets, and usually 
> crashed on bigger ones (starting from 3 GB). So size is key?
> 2. But now I have found several histories of 4.5GB that I was able to 
> export... So far for the size hypothesis.
> 
> Another observation: when the export crashes, the corresponding webhandler 
> process dies.
> 

A crashing Python process crosses the fine boundary between the Galaxy code and 
Python internals, so perhaps the Galaxy developers can help with this problem.

It would be helpful to find a reproducible case with a specific history or a 
specific sequence of events, so that someone can help you with the debugging.

Once you find a history that causes a crash (every time or sometimes, but in a 
reproducible way), try to pinpoint when exactly it happens:
Is it when you start preparing the export (and "export_history.py" is running 
as a job), or when you start downloading the exported file?
(I'm a bit behind on the export mechanism, so perhaps there are other steps 
involved.)

A couple of things to try:

1. Set "cleanup_job=never" in your universe_wsgi.ini. This will keep the 
temporary files and will help you reproduce jobs later.

2. Enable "abrt" again. It is not the problem (just the symptom).
You can clean up the "/var/spool/abrt/XXX" directory from previous crash logs, 
then reproduce a new crash, and look at the collected files (assuming you have 
enough space to store at least one crash).
In particular, look at the file called "coredump": it will tell you which 
script crashed.
Try running:
    $ file /var/spool/abrt/XXXX/coredump
    coredump ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python XXXXXX.py'

Instead of "XXXXXX.py" it would show the Python script that crashed (hopefully 
with full command-line parameters).

It won't show which python statement caused the crash, but it will point in the 
right direction.
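
If several crash directories have accumulated, the coredump check above can be 
scripted. This is only a minimal Python sketch: it assumes the `file` output 
ends with the "from '...'" form shown above, and the helper names 
(crashed_command, scan_abrt) and the glob pattern are my own illustration, not 
part of abrt or Galaxy. Reading the spool directory may require root.

```python
import glob
import re
import subprocess

# Matches the trailing "from '<command line>'" part of `file` output for a
# core file. Assumption: the command line contains no single quote.
FROM_RE = re.compile(r"from '([^']*)'")


def crashed_command(file_output):
    """Return the command line recorded in the core file, or None."""
    match = FROM_RE.search(file_output)
    return match.group(1) if match else None


def scan_abrt(spool_dir="/var/spool/abrt"):
    """Run `file` on each coredump under spool_dir; return {path: command}."""
    results = {}
    for path in sorted(glob.glob(spool_dir + "/*/coredump")):
        proc = subprocess.run(["file", path], capture_output=True, text=True)
        results[path] = crashed_command(proc.stdout)
    return results


if __name__ == "__main__":
    for path, command in scan_abrt().items():
        print(path, "->", command)
```

A value of None for a path means `file` did not report a "from" command for 
that core file, so that crash has to be inspected by hand.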

> So now I suspect something to be wrong with the datasets, but I am not able 
> to trace something meaningful in the logs.  I am not confident in turning on 
> logging in Python yet, but apparently this happens with the module "logging" 
> initiated like logging.getLogger( __name__ ).
> 

It could be a bad dataset (file on disk), or a problem in the database, or 
something completely different (a bug in the Python archive module).
No point guessing until there are more details.
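
On the logging question: logging.getLogger( __name__ ) only creates a named 
logger; debug messages stay hidden until a handler and a level are configured 
somewhere. Here is a generic standard-library sketch, outside Galaxy. The names 
export_step and capture_debug and the log format are made up for illustration, 
and Galaxy may configure its own handlers differently.

```python
import io
import logging

# Module-level logger, created the way the question describes.
log = logging.getLogger(__name__)


def export_step(name):
    """Hypothetical unit of work that emits one debug message."""
    log.debug("starting step: %s", name)
    return name


def capture_debug(func, *args):
    """Run func with a temporary DEBUG handler; return (result, log text)."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    old_level = log.level
    log.addHandler(handler)
    log.setLevel(logging.DEBUG)
    try:
        result = func(*args)
    finally:
        # Restore the logger so the change stays local to this call.
        log.setLevel(old_level)
        log.removeHandler(handler)
    return result, buffer.getvalue()


if __name__ == "__main__":
    # Simplest form: send DEBUG and above from all loggers to stderr.
    logging.basicConfig(level=logging.DEBUG)
    export_step("archive")
```

The temporary handler is handy when you only want debug output around one 
suspicious call instead of for the whole process.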

-gordon
