Unfortunately, things remain as before. Here is some additional debug output that I found in the *inferior-lisp* buffer of my Emacs session:

Deserialization error in map: returning nil for element
transaction has active cursors
PANIC: Invalid argument
PANIC: DB_RUNRECOVERY: Fatal error, run database recovery
PANIC: fatal region error detected; run recovery

The deserialization error happens in one of two spots:
- when trying to get-from-root a btree or indexed btree
- when trying to get a pset via a call to get-value on the btree

Running recovery does not help; the errors start as soon as I load and execute my code post-recovery.

Ian, can you enlighten me as to where in the code you believe these errors to originate? Also, can you think of anything that I might be doing incorrectly in my code that would cause this sort of behavior? Here is a brief outline of what I am doing:

There is
- a waitqueue (via SBCL's sb-thread:make-waitqueue)
- a TCP listener thread that writes its input to an incoming work queue
- 3 reader threads that use the waitqueue to get work items from the work queue

The readers parse the incoming data and create a persistent object from it. They also index the individual words of one of the persistent object's slots using a btree. These actions are contained in a with-transaction block.

There is also a Hunchentoot web server thread running. Users can search the indexed text via this web interface. It is really rather simple.

The errors crop up randomly, sometimes when I am running searches via the web interface while the listener and reader threads are active, and sometimes while the listener and reader threads are doing their thing without any web-based reads happening.

Cheers,
Kevin

Kevin Raison wrote:
> Thanks, Ian. All tests pass for me as well. I am running my
> application with the new code and will let you know how it goes.
>
> Thanks again!
> Kevin
>
> Ian Eslick wrote:
>> I just checked in a few more fixes a few minutes ago. I finally was
>> able to reproduce some of this locally. Try cleaning out the test
>> database and re-running the tests.
>> Everything passes for me on a fresh DB and my own application is
>> running fine too.
>>
>> Ian
>>
>> On Jan 10, 2009, at 2:04 PM, Kevin Raison wrote:
>>
>>> After pulling the latest patches and rebuilding without optimizations as
>>> you suggest, I get the following errors, the second of which is slightly
>>> different than what I was receiving previously.
>>>
>>> First, a deserialization error:
>>> Condition ELEPHANT-TYPE-DESERIALIZATION-ERROR was signalled.
>>> [Condition of type ELEPHANT-TYPE-DESERIALIZATION-ERROR]
>>>
>>> Restarts:
>>> 0: [RETRY] Retry SLIME REPL evaluation request.
>>> 1: [ABORT] Return to SLIME's top level.
>>> 2: [TERMINATE-THREAD] Terminate this thread (#<THREAD "repl-thread" RUNNING {10034F31F1}>)
>>>
>>> Backtrace:
>>> 0: ((LAMBDA (SWANK-BACKEND::DEBUGGER-LOOP-FN)) #<FUNCTION (LAMBDA #) {10040B2389}>)
>>> 1: (SWANK::DEBUG-IN-EMACS #<ELEPHANT-TYPE-DESERIALIZATION-ERROR {1002CB3531}>)
>>> 2: (SWANK:INVOKE-SLIME-DEBUGGER #<ELEPHANT-TYPE-DESERIALIZATION-ERROR {1002CB3531}>)
>>> 3: ((LAMBDA (SWANK-BACKEND::HOOK SWANK-BACKEND::FUN)) #<FUNCTION SWANK:SWANK-DEBUGGER-HOOK> #<CLOSURE (LAMBDA #) {1002CB3AC9}>)
>>> 4: (INVOKE-DEBUGGER #<ELEPHANT-TYPE-DESERIALIZATION-ERROR {1002CB3531}>)
>>> 5: ((FLET #:FUN23) #<ELEPHANT-TYPE-DESERIALIZATION-ERROR {1002CB3531}>)
>>> 6: ((SB-PCL::FAST-METHOD ELEPHANT::EXECUTE-TRANSACTION (DB-BDB::BDB-STORE-CONTROLLER T)) ..)[:EXTERNAL]
>>> 7: (SB-INT:SIMPLE-EVAL-IN-LEXENV (REINDEX-LOG-ENTRIES) #<NULL-LEXENV>)
>>> 8: (SWANK::EVAL-REGION "(reindex-log-entries)\n")
>>> 9: ((LAMBDA ()))
>>> 10: (SWANK::TRACK-PACKAGE #<CLOSURE (LAMBDA #) {1002C596F9}>)
>>> 11: (SWANK::CALL-WITH-RETRY-RESTART "Retry SLIME REPL evaluation request."
>>> #<CLOSURE (LAMBDA #) {1002C59619}>)
>>> 12: (SWANK::CALL-WITH-BUFFER-SYNTAX NIL #<CLOSURE (LAMBDA #) {1002C595E9}>)
>>> 13: (SWANK::REPL-EVAL "(reindex-log-entries)\n")
>>>
>>> And then this one with any subsequent access to the bdb:
>>>
>>> Bad type argument:
>>>   BDB-DB-ERROR
>>> [Condition of type SIMPLE-TYPE-ERROR]
>>>
>>> Restarts:
>>> 0: [RETRY] Retry SLIME REPL evaluation request.
>>> 1: [ABORT] Return to SLIME's top level.
>>> 2: [TERMINATE-THREAD] Terminate this thread (#<THREAD "repl-thread" RUNNING {10034F31F1}>)
>>>
>>> Backtrace:
>>> 0: (MAKE-CONDITION BDB-DB-ERROR)[:EXTERNAL]
>>> 1: (ERROR BDB-DB-ERROR)[:EXTERNAL]
>>> 2: ((SB-PCL::FAST-METHOD ELEPHANT::EXECUTE-TRANSACTION (DB-BDB::BDB-STORE-CONTROLLER T)) #<unused argument> #<unused argument> #<BDB-STORE-CONTROLLER /srv/console/db/> #<CLOSURE (LAMBDA #) {10037C53D9$
>>> 3: (ADD-LOG-ENTRY)[:EXTERNAL]
>>> 4: (SB-INT:SIMPLE-EVAL-IN-LEXENV (ADD-LOG-ENTRY :SEVERITY 2 :HOST-UUID "123" :TIMESTAMP ...) #<NULL-LEXENV>)
>>> 5: (SWANK::EVAL-REGION "(add-log-entry :severity 2 :host-uuid \"123\" :timestamp (get-universal-time)\n :program \"blah\" :text \"test entry\")\n")
>>> 6: ((LAMBDA ()))
>>> 7: (SWANK::TRACK-PACKAGE #<CLOSURE (LAMBDA #) {10037C1BF9}>)
>>> 8: (SWANK::CALL-WITH-RETRY-RESTART "Retry SLIME REPL evaluation request."
>>> #<CLOSURE (LAMBDA #) {1003636499}>)
>>> 9: (SWANK::CALL-WITH-BUFFER-SYNTAX NIL #<CLOSURE (LAMBDA #) {1003636469}>)
>>> 10: (SWANK::REPL-EVAL "(add-log-entry :severity 2 :host-uuid \"123\" :timestamp (get-universal-time)\n :program \"blah\" :text \"test entry\")\n")
>>> 11: (SB-INT:SIMPLE-EVAL-IN-LEXENV (SWANK:LISTENER-EVAL "(add-log-entry :severity 2 :host-uuid \"123\" :timestamp (get-universal-time)\n :program \"blah\" :text \"test entry\")\n"$
>>> 12: (SWANK::EVAL-FOR-EMACS (SWANK:LISTENER-EVAL "(add-log-entry :severity 2 :host-uuid \"123\" :timestamp (get-universal-time)\n :program \"blah\" :text \"test entry\")\n") "CONSO$
>>> 13: (SWANK::PROCESS-REQUESTS NIL)
>>> 14: ((LAMBDA ()))
>>> 15: ((LAMBDA (SWANK-BACKEND::HOOK SWANK-BACKEND::FUN)) #<FUNCTION SWANK:SWANK-DEBUGGER-HOOK> #<CLOSURE (LAMBDA #) {10038AD269}>)
>>> 16: (SWANK::CALL-WITH-REDIRECTED-IO #<SWANK::CONNECTION {10032FB281}> #<CLOSURE (LAMBDA #) {10038AD289}>)
>>> 17: (SWANK::CALL-WITH-CONNECTION #<SWANK::CONNECTION {10032FB281}> #<CLOSURE (LAMBDA #) {10038AD269}>)
>>> 18: (SWANK::HANDLE-REQUESTS #<SWANK::CONNECTION {10032FB281}> NIL)
>>> 19: (SWANK::CALL-WITH-BINDINGS NIL #<CLOSURE (LAMBDA #) {10034F5189}>)
>>> 20: ((FLET SB-THREAD::WITH-MUTEX-THUNK))
>>> 21: ((FLET #:WITHOUT-INTERRUPTS-BODY-[CALL-WITH-MUTEX]477))
>>> 22: (SB-THREAD::CALL-WITH-MUTEX ..)
>>> 23: ((LAMBDA ()))
>>> 24: ("foreign function: call_into_lisp")
>>> 25: ("foreign function: new_thread_trampoline")
>>>
>>>
>>> Ian Eslick wrote:
>>>> Also try pushing :elephant-without-optimize onto *features* prior to
>>>> doing a force rebuild of elephant. This should give you some more
>>>> information and rule out any optimization/declaration related bugs.
>>>>
>>>> Thank you,
>>>> Ian
>>>>
>>>>
>>>> On Jan 7, 2009, at 9:10 PM, Kevin Raison wrote:
>>>>
>>>>> I was finally able to recreate this in the repl (as opposed to seeing it
>>>>> in my error logs), so here is a trace:
>>>>>
>>>>> The slot DB-BDB::INDICES-CACHE is unbound in the object
>>>>> #<BDB-INDEXED-BTREE oid:2>.
>>>>> [Condition of type UNBOUND-SLOT]
>>>>>
>>>>> Backtrace:
>>>>> 0: ((SB-PCL::FAST-METHOD SLOT-UNBOUND (T T T)) #<unavailable argument> #<unavailable argument> #<unavailable argument> #<BDB-INDEXED-BTREE oid:2> DB-BDB::INDICES-CACHE)
>>>>> 1: (SB-PCL::SLOT-UNBOUND-INTERNAL #<BDB-INDEXED-BTREE oid:2> 2)
>>>>> 2: ((SB-PCL::FAST-METHOD (SETF GET-VALUE) (T T DB-BDB::BDB-INDEXED-BTREE)) #<unavailable lambda list>)
>>>>> 3: ((LAMBDA (WORD)) "34766")
>>>>> 4: (SB-IMPL::MAP1 #<CLOSURE (LAMBDA #) {1003CA4ED9}> (("" "asa" "106007" "deny" "inbound" "udp" ...)) :LIST T)
>>>>> 5: (MAPCAR #<CLOSURE (LAMBDA #) {1003CA4ED9}> ("" "asa" "106007" "deny" "inbound" "udp" ...))[:EXTERNAL]
>>>>> 6: ((SB-PCL::FAST-METHOD INDEX-LOG-ENTRY (LOG-ENTRY)) #<unavailable argument> #<unavailable argument> #<LOG-ENTRY oid:6373>)
>>>>> 7: ((LAMBDA ()))
>>>>> 8: ((SB-PCL::FAST-METHOD ELEPHANT::EXECUTE-TRANSACTION (DB-BDB::BDB-STORE-CONTROLLER T)) #<unavailable argument> #<unavailable argument> #<unavailable argument> #<unavailable argument>)[:EXTERNAL]
>>>>> 9: (ADD-LOG-ENTRY ..)
>>>>> 10: (SB-INT:SIMPLE-EVAL-IN-LEXENV ..)
>>>>>
>>>>> And here is the code that sometimes causes the issue:
>>>>> (defmethod index-log-entry ((log-entry log-entry))
>>>>>   (let ((*store-controller* *syslog-controller*))
>>>>>     (let ((btree (get-from-root "log-entry-index" :sc *syslog-controller*)))
>>>>>       (map-words #'(lambda (word)
>>>>>                      (unless (or (member word *superfluous-words* :test #'string-equal)
>>>>>                                  (< (length word) 2))
>>>>>                        (unless (existsp word btree)
>>>>>                          (setf (get-value word btree) (make-pset :sc *syslog-controller*)))
>>>>>                        (insert-item log-entry (get-value word btree))))
>>>>>                  (text log-entry)))))
>>>>>
>>>>>
>>>>> If I drop and recreate the btree, everything runs smoothly (no errors)
>>>>> for a few hours and then the above error message starts showing up about
>>>>> half to three quarters of the time.
>>>>>
>>>>> Thanks again,
>>>>> Kevin
>>>>>
>>>>>
>>>>> Kevin Raison wrote:
>>>>>> I am seeing an intermittent error with 1.0 alpha when trying to write to
>>>>>> an indexed btree (using BerkeleyDB 4.7 as provided by Ubuntu's package
>>>>>> repositories):
>>>>>>
>>>>>> The slot DB-BDB::INDICES-CACHE is unbound in the object
>>>>>> #<BDB-INDEXED-BTREE oid:2>
>>>>>>
>>>>>> Within the same thread, sometimes this happens and sometimes I am able
>>>>>> to read and write to the btree. I am using sbcl 1.0.24 on 32 bit intel
>>>>>> linux. Is there something obvious that might help alleviate this or
>>>>>> should I provide more context?
>>>>>>
>>>>>> Thanks!
>>>>>> Kevin
>>>>>>
>>>>>> _______________________________________________
>>>>>> elephant-devel site list
>>>>>> elephant-devel@common-lisp.net
>>>>>> http://common-lisp.net/mailman/listinfo/elephant-devel