Hi Am 12.08.2013 um 14:02 schrieb Ralph Böhme <r...@netafp.com>: > Am 12.08.2013 um 11:32 schrieb Martyn Russell <mar...@lanedo.com>: >> On 11/08/13 09:24, Ralph Böhme wrote: >>> As I'm only really at the beginning of an in depth analysis, I can't >>> say for sure that the hangs I see are the cause of this, but knowing >>> there seems to exist a fundamental design flaw in tracker-extra-pdf, >>> I'm asking for thoughts on this. >> >> So we use the parent/child set up because some PDFs take a REALLY long time >> to process and we have a 10 second window for them to be indexed. > > Yes, I know. My test data contains many, many very large PDFs (up 1000 pages). > >> After that we kill the child process and return. > > Understood. Just for references, this happens in > > src/tracker-extract/tracker-extract-pdf.c:extract_content(). > >> We did this because we didn't want to kill the tracker-extract process all >> the time. > > I just stumbled across this one, didn't notice before. A 20 seconds watchdog > timeout setup in > > src/tracker-extract/tracker-controller.c:metadata_data_new(). > >> In reality, this is actually what tracker-extract was built to do, so ... > > It seems in reality the 20 s watchdog timer is setup too early! Afair it is > not setup before the actual extraction is started, but already at the moment > the extraction event is received. > > Now if several extraction requests a received and all of these are files > (e.g. PDFs) that take long to extract, the watchdog timers for requests still > on the pending queue are already expiring, I see this sequence of events in > the debug log: > > > > 12 Aug 2013, 11:55:49: Tracker: <--- [12|0] > handle_method_call_get_metadata_fast > > (uri:'file:///tank/test/PDF/Security/Network%20Security%20With%20Openssl.pdf', > mime:'application/pdf', index_fd:0) > > Extraction requests 1-11 are still being processes, so it takes some time > before this one gets it shot here: > > 12 Aug 2013, 11:56:06: Tracker: Dispatching > 'file:///tank/test/PDF/Security/Network%20Security%20With%20Openssl.pdf' > in main thread > > Unfortunately this one takes more then 4 seconds, so the watchdog killer > steps in: > > 12 Aug 2013, 11:56:10: Tracker-Critical **: Extraction task for > 'file:///tank/test/PDF/Security/Network%20Security%20With%20Openssl.pdf' > went rogue and took more than 20 seconds. Forcing exit. > > ...and tracker-extract gets restarted. > > Increasing WATCHDOG_TIMEOUT from 20 to a large value will workaround this, > but a proper fix would be to setup the timer in > src/tracker-extract/tracker-extract.c:dispatch_task_cb() or similar. > >>> Afaict, the right design would involve an exec() in the child and >>> using some other IPC channel. I'll happily volunteer. >> >> Yea, so we are actually calling exit() in the child. See: >> >> extract_content_child_process() > > What has the exit() call to do with this? Afaict that's completely unrelated > to the issue of using fork() in programs using glib. > >>> Thoughts? >> >> Are you sure it isn't a difficult PDF taking too long? > > Yes, heres a SBT from a stuck process: ...
now here's another one, different cause this time: ----------------- lwp# 2 / thread# 2 -------------------- fe6d4749 lwp_park (0, 0, 0) fe6cc93b mutex_lock_impl (fe773098, 0, 0, fe671657, 80c8000, 1) + 166 fe6cca7e mutex_lock (fe773098) + 16 fe670fb5 malloc (2c, fe772000, fddee368, fe6cbb23, fe7730a4, 1000000) + 2d fe39d565 g_malloc (2c, fe772000, fddee398, fe6cbb23, fe7730a4, 0) + 25 fe3e2d0f _g_gnulib_printf_parse (fcff4a6c, fddee420, fddee410, fe773098, fe773098, 0) + 2f fe3e36a0 _g_gnulib_vasnprintf (0, fddee46c, fcff4a6c, fddee58c, fddee480, 6) + 30 fe3e4df3 _g_gnulib_vasprintf (fddee4bc, fcff4a6c, fddee58c, fddee4ac, fe447474, 80) + 33 fe3de8a4 g_vasprintf (fddee4bc, fcff4a6c, fddee58c, 2c, fde1a000, fe772000) + 34 fe3b7962 g_strdup_vprintf (fcff4a6c, fddee58c, 0, fe6cbcc9, fe7730a4, 1000000) + 32 fe39ecaa g_logv (fcff4fa9, 80, fcff4a6c, fddee58c, fdce0240) + 3a fe39f022 g_log (fcff4fa9, 80, fcff4a6c, fe6c34ee, fe777000, 0) + 32 fcff2728 extract_content_child_cleanup (12, 0, fddee6b0, fe6ae447, 12, fe772000) + 38 fe6d47d5 __sighndlr (12, 0, fddee6b0, fcff26f0) + 15 fe6c7dae call_user_handler (12) + 2d2 fe6c8007 sigacthandler (12, 0, fddee6b0) + ee --- called from signal handler with signal 18 (SIGCLD) --- fe671033 _malloc_unlocked (5) + 43 fe670fc0 malloc (5, fe27cac2, fddeec88, fe3dd554, 884e088, 8085800) + 38 fe39d565 g_malloc (5, 0, 4, fe3cf3ac, 884e088, fe27cac1) + 25 fe3b756b g_strdup (819a800, 0, fddeecd8, fe23403c, 857d950, 0) + 3b fe246a88 _g_dbus_method_invocation_new (819a800, 8105320, 813fa50, 8110288, 80c4da8, 8085800) + 188 fe22c588 validate_and_maybe_schedule_method_call (1, 0, 80bfdd0, 80c2070, 80bff98, 8083058) + e8 fe22d361 on_worker_message_received (808ab10, 8091c30, 8085800, fddeee08, fe447474, 807e590) + 4b1 fe242c0e _g_dbus_worker_do_read_cb (8088828, 8090618, 808ab10, fe6cd084, 80819b0, 0) + 72e fe1d5515 g_simple_async_result_complete (8090618, 0, fddeee88, 8087358, 85b0ac0) + b5 fe1d557d complete_in_idle_cb (8090618, fe447474, fddeeef8, fe447474, fe3df359) + 1d fe393c71 g_idle_dispatch (85b0ac0, fe1d5560, 8090618, fddeeed0, 8081ab0, 1) + 21 fe397285 g_main_context_dispatch (808dca8, 0, 8081ab0, 1, 3, 0) + 135 fe397648 g_main_context_iterate.isra.21 (1, 808dca8, 0, fe397aaa, fe2a189c, 8088488) + 218 fe397b23 g_main_loop_run (80884b8, 807db80, 0, fe6cd275, fe447474, 807db80) + 83 fe24070c gdbus_shared_thread_func (8088488, 807db80, 208, fe6d37e9, 0, 0) + 2c fe3c0442 g_thread_proxy (807db80, fe772000, fddeefe8, fe6d43f9) + 72 fe6d444c _thrp_setup (fdce0240) + 9d fe6d46f0 _lwp_start (fdce0240, 0, 0, 0, 0, 0) ----------------- lwp# 3 / thread# 3 -------------------- Looks like we're using malloc() in a signal handler. That must be l.469 in tracker-extract-pdf.c: g_debug ("Parent: Zombie %d reaped", child_pid); I wonder though why this is evaluated, as my tracker log setting are only "detailed": root@beast:~# tracker-control --get-log-verbosity Komponenten: Store : detailed Extract : detailed Writeback: detailed Indizierer (Nur jene, die in der Konfiguration aufgelistet sind): Files : detailed root@beast:~# I'll run another test with the log statements removed and report back. -r -- Ralph Böhme <r...@netafp.com> Netatalk Developer | Support | Services Curslacker Deich 254, 21039 Hamburg, Germany http://www.netafp.com/
smime.p7s
Description: S/MIME cryptographic signature
_______________________________________________ tracker-list mailing list tracker-list@gnome.org https://mail.gnome.org/mailman/listinfo/tracker-list