[libxml-devel] SEGV with libxml-ruby 1.0.0
Having upgraded to libxml-ruby 1.0.0 yesterday I am now seeing repeatable crashes in the garbage collection. The end of the trace looks like:

#0 rxml_attr_mark (xattr=0x0) at ruby_xml_attr.c:41
#1 0xb7ed6a15 in gc_mark_children (ptr=3050895040, lev=1) at gc.c:945
#2 0xb7ed6c49 in mark_locations_array (x=0xbfc32f90, n=39) at gc.c:629
#3 0xb7ed6e17 in garbage_collect () at gc.c:1366
#4 0xb7ed79c5 in ruby_xmalloc (size=48) at gc.c:103
#5 0xb71855fa in xmlNewPropInternal (node=0xa20a190, ns=0x0, name=0x84983b0 "k", value=0xa20a170 "created_by", eatname=0) at tree.c:1791
#6 0xb727ce48 in rxml_attr_initialize (argc=3, argv=0xbfc333a0, self=3050895040) at ruby_xml_attr.c:104
#7 0xb7eb8815 in call_cfunc (func=0xb727cd30 , recv=3050895040, len=34, argc=3, argv=0xb727d100) at eval.c:5691
#8 0xb7ec0f1d in rb_call0 (klass=3075364500, recv=3050895040, id=2961, oid=2961, argc=3, argv=0xbfc333a0, body=0xb74e561c, flags=0) at eval.c:5846
#9 0xb7ec11d8 in rb_call (klass=3075364500, recv=3050895040, mid=2961, argc=3, argv=0xbfc333a0, scope=1, self=6) at eval.c:6093
#10 0xb7ec14e7 in rb_obj_call_init (obj=3050895040, argc=3, argv=0xbfc333a0) at eval.c:7625
#11 0xb7eef43a in rb_class_new_instance (argc=3, argv=0xbfc333a0, klass=3075364500) at object.c:1594
#12 0xb726cc63 in rxml_attributes_attribute_set (self=3050895060, name=3050895080, value=3050895220) at ruby_xml_attributes.c:176
#13 0xb726f5e3 in rxml_node_property_set (self=3050895120, name=3050895080, value=3050895220) at ruby_xml_node.c:1106

As you can see it is being asked to mark an attribute, but the attribute pointer it is given is null.

Any ideas?

Tom

--
Tom Hughes (t...@compton.nu)
http://www.compton.nu/
___
libxml-devel mailing list
libxml-devel@rubyforge.org
http://rubyforge.org/mailman/listinfo/libxml-devel
Re: [libxml-devel] SEGV with libxml-ruby 1.0.0
Tom Hughes wrote:

> Having upgraded to libxml-ruby 1.0.0 yesterday I am now seeing repeatable crashes in the garbage collection. The end of the trace looks like:
>
> #0 rxml_attr_mark (xattr=0x0) at ruby_xml_attr.c:41
> #1 0xb7ed6a15 in gc_mark_children (ptr=3050895040, lev=1) at gc.c:945
> #2 0xb7ed6c49 in mark_locations_array (x=0xbfc32f90, n=39) at gc.c:629
> #3 0xb7ed6e17 in garbage_collect () at gc.c:1366
> #4 0xb7ed79c5 in ruby_xmalloc (size=48) at gc.c:103
> #5 0xb71855fa in xmlNewPropInternal (node=0xa20a190, ns=0x0, name=0x84983b0 "k", value=0xa20a170 "created_by", eatname=0) at tree.c:1791

One of my colleagues thinks he has spotted the problem:

"The attribute is being allocated (rxml_attr_alloc), which sets the data pointer to NULL. Almost immediately, the initialise method is called (rxml_attr_initialize) where the self object has that NULL data pointer. When it gets to xmlNewProp it triggers the GC, which tries to mark the current object, which hasn't finished initialising yet..."

Making rxml_attr_mark() return immediately if xattr is null seems to have stopped it segfaulting anyway - the mark routine for nodes is already doing that, in fact.

Tom

--
Tom Hughes (t...@compton.nu)
http://www.compton.nu/
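
A minimal sketch of the guard described in this message, written as a self-contained C simulation and not taken from the libxml-ruby source. The names demo_attr, demo_gc_mark, demo_marked_count and demo_attr_mark are hypothetical stand-ins; in the real extension the callback is rxml_attr_mark and the marking goes through Ruby's C API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the wrapped libxml2 attribute. */
typedef struct demo_attr {
    void *owner;   /* object that must stay alive while the attribute does */
} demo_attr;

/* Counts marked objects (stand-in for real GC bookkeeping). */
int demo_marked_count = 0;

/* Stand-in for the GC's mark primitive: records that obj was reached. */
void demo_gc_mark(void *obj)
{
    assert(obj != NULL);   /* callers must only pass live objects */
    demo_marked_count++;
}

/* Mark callback. The GC may call it between allocation and the end of
 * initialization, when the data pointer is still NULL. */
void demo_attr_mark(demo_attr *xattr)
{
    if (xattr == NULL)
        return;            /* not initialized yet: nothing to mark */
    if (xattr->owner != NULL)
        demo_gc_mark(xattr->owner);
}
```

Calling demo_attr_mark(NULL) returns without touching memory, which is the behaviour the one-line fix is meant to provide. Whether other mark or free callbacks need the same guard is a separate question this sketch does not answer.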
Re: [libxml-devel] SEGV with libxml-ruby 1.0.0
Hi Tom,

> Having upgraded to libxml-ruby 1.0.0 yesterday I am now seeing repeatable crashes in the garbage collection. The end of the trace looks like:
>
> #0 rxml_attr_mark (xattr=0x0) at ruby_xml_attr.c:41
> #1 0xb7ed6a15 in gc_mark_children (ptr=3050895040, lev=1) at gc.c:945
> #2 0xb7ed6c49 in mark_locations_array (x=0xbfc32f90, n=39) at gc.c:629
> #3 0xb7ed6e17 in garbage_collect () at gc.c:1366
> #4 0xb7ed79c5 in ruby_xmalloc (size=48) at gc.c:103
> #5 0xb71855fa in xmlNewPropInternal (node=0xa20a190, ns=0x0, name=0x84983b0 "k", value=0xa20a170 "created_by", eatname=0) at tree.c:1791
> #6 0xb727ce48 in rxml_attr_initialize (argc=3, argv=0xbfc333a0, self=3050895040) at ruby_xml_attr.c:104
> #7 0xb7eb8815 in call_cfunc (func=0xb727cd30 , recv=3050895040, len=34, argc=3, argv=0xb727d100) at eval.c:5691
> #8 0xb7ec0f1d in rb_call0 (klass=3075364500, recv=3050895040, id=2961, oid=2961, argc=3, argv=0xbfc333a0, body=0xb74e561c, flags=0) at eval.c:5846
> #9 0xb7ec11d8 in rb_call (klass=3075364500, recv=3050895040, mid=2961, argc=3, argv=0xbfc333a0, scope=1, self=6) at eval.c:6093
> #10 0xb7ec14e7 in rb_obj_call_init (obj=3050895040, argc=3, argv=0xbfc333a0) at eval.c:7625
> #11 0xb7eef43a in rb_class_new_instance (argc=3, argv=0xbfc333a0, klass=3075364500) at object.c:1594
> #12 0xb726cc63 in rxml_attributes_attribute_set (self=3050895060, name=3050895080, value=3050895220) at ruby_xml_attributes.c:176
> #13 0xb726f5e3 in rxml_node_property_set (self=3050895120, name=3050895080, value=3050895220) at ruby_xml_node.c:1106
>
> As you can see it is being asked to mark an attribute, but the attribute pointer it is given is null.

Yeah, this should be easy to track down. Is it 1.9.1 by chance? Do you have a test case?

Charlie
Re: [libxml-devel] SEGV with libxml-ruby 1.0.0
Charlie Savage wrote:

>> As you can see it is being asked to mark an attribute, but the attribute pointer it is given is null.
>
> Yeah, this should be easy to track down. Is it 1.9.1 by chance? Do you have a test case?

No, it's 1.8.6 actually.

I don't have a simple test case I'm afraid - it was happening on the openstreetmap code base against our live database...

Tom

--
Tom Hughes (t...@compton.nu)
http://www.compton.nu/
Re: [libxml-devel] SEGV with libxml-ruby 1.0.0
> #0 rxml_attr_mark (xattr=0x0) at ruby_xml_attr.c:41
> #1 0xb7ed6a15 in gc_mark_children (ptr=3050895040, lev=1) at gc.c:945
> #2 0xb7ed6c49 in mark_locations_array (x=0xbfc32f90, n=39) at gc.c:629
> #3 0xb7ed6e17 in garbage_collect () at gc.c:1366
> #4 0xb7ed79c5 in ruby_xmalloc (size=48) at gc.c:103
> #5 0xb71855fa in xmlNewPropInternal (node=0xa20a190, ns=0x0, name=0x84983b0 "k", value=0xa20a170 "created_by", eatname=0) at tree.c:1791
>
> One of my colleagues thinks he has spotted the problem:
>
> "the attribute is being allocated (rxml_attr_alloc), which sets the data pointer to NULL. almost immediately, the initialise method is called (rxml_attr_initialize) where the self object has that NULL data pointer. when it gets to xmlNewProp it triggers the GC which tries to mark the current object, which hasn't finished initialising yet...

This is an interesting one, and the diagnosis is correct. It is caused by having libxml use ruby's memory allocator.

1. Create a new ruby attribute object
2. Initialize it
3. Call libxml xmlNewNsProp
4. It asks ruby for memory
5. Ruby runs a gc since it has no memory
6. The newly created, but not initialized attribute has its mark function called

The reason for that change was that it greatly reduced libxml-ruby's memory usage.

> Making rxml_attr_mark() return immediately if xattr is null seems to have stopped it segving anyway - the mark routine for nodes is already doing that in fact.

Yes, that would work. But I wonder if there are other cases of something similar happening.

I think you're seeing it because you probably have a lot higher load on your server than what we test with. I tried to duplicate the issue but didn't succeed:

    def test_high_allocations
      node = XML::Node.new('test')
      1.upto(10) do |i|
        name = "attr_#{i}"
        XML::Attr.new(node, name, i.to_s)
      end
      assert(true)
    end

Anyway, I added the check and went back to using libxml's internal memory allocator. The fixes are in 1.1.0, which I just uploaded. Let me know how it goes.

Charlie
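
The six steps listed in this message can be replayed in a small, self-contained C simulation. This is only a sketch under stated assumptions: demo_obj, demo_live, demo_gc, demo_malloc, demo_obj_alloc and demo_obj_initialize are hypothetical names, the "collector" runs on every allocation to make the ordering deterministic, and nothing here is the actual Ruby or libxml2 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper object: data stays NULL until initialize finishes. */
typedef struct demo_obj {
    void *data;
} demo_obj;

#define DEMO_MAX_OBJS 16
demo_obj *demo_live[DEMO_MAX_OBJS];   /* objects the fake collector can see */
size_t demo_live_count = 0;
int demo_null_marks = 0;              /* times mark saw a half-built object */

/* Mark callback with the NULL guard in place. */
void demo_mark(demo_obj *obj)
{
    if (obj->data == NULL) {
        demo_null_marks++;            /* without the guard this would crash */
        return;
    }
    /* a real callback would mark objects reachable from obj->data here */
}

/* Fake collector: visits every live object (step 6). */
void demo_gc(void)
{
    for (size_t i = 0; i < demo_live_count; i++)
        demo_mark(demo_live[i]);
}

/* Allocator that collects before every allocation (steps 4 and 5). */
void *demo_malloc(size_t size)
{
    demo_gc();
    return malloc(size);
}

/* Step 1: create the wrapper with a NULL data pointer and register it. */
demo_obj *demo_obj_alloc(void)
{
    assert(demo_live_count < DEMO_MAX_OBJS);
    demo_obj *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->data = NULL;
    demo_live[demo_live_count++] = obj;
    return obj;
}

/* Steps 2 and 3: initialization allocates, so the collector runs
 * while obj->data is still NULL. */
int demo_obj_initialize(demo_obj *obj)
{
    void *data = demo_malloc(48);
    if (data == NULL)
        return -1;
    obj->data = data;
    return 0;
}
```

After demo_obj_alloc() followed by demo_obj_initialize(), demo_null_marks is 1: the mark callback ran exactly once on the half-built object. In a real interpreter the collection only happens when memory is short, which may be why a small test does not trigger it.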
[libxml-devel] Announcing libxml-ruby 1.1.0
Well, that didn't last quite as long as I had hoped.

A new version of libxml-ruby, version 1.1.0, is now available. It includes two changes:

* Fix bug caused by the mark function being called on partially initialized attributes.
* Revert back to libxml2's internal memory manager.

These changes hopefully fix the issue reported earlier today by Tom.

Charlie
Re: [libxml-devel] SEGV with libxml-ruby 1.0.0
What a shame. I think we should investigate alternatives, but high memory usage is better than crashing. :'(

I should point out that we also have high load but we haven't experienced any segfaulting - so it may be configuration dependent, but not necessarily.

I'll try and take a look at this when I get the chance to see if we can find a way to have our cake and eat it too.

- Joe Khoobyar

Charlie Savage wrote:

> > #0 rxml_attr_mark (xattr=0x0) at ruby_xml_attr.c:41
> > #1 0xb7ed6a15 in gc_mark_children (ptr=3050895040, lev=1) at gc.c:945
> > #2 0xb7ed6c49 in mark_locations_array (x=0xbfc32f90, n=39) at gc.c:629
> > #3 0xb7ed6e17 in garbage_collect () at gc.c:1366
> > #4 0xb7ed79c5 in ruby_xmalloc (size=48) at gc.c:103
> > #5 0xb71855fa in xmlNewPropInternal (node=0xa20a190, ns=0x0, name=0x84983b0 "k", value=0xa20a170 "created_by", eatname=0) at tree.c:1791
> >
> > One of my colleagues thinks he has spotted the problem:
> >
> > "the attribute is being allocated (rxml_attr_alloc), which sets the data pointer to NULL. almost immediately, the initialise method is called (rxml_attr_initialize) where the self object has that NULL data pointer. when it gets to xmlNewProp it triggers the GC which tries to mark the current object, which hasn't finished initialising yet...
>
> This is an interesting one, and the diagnosis is correct. It is caused by having libxml use ruby's memory allocator.
>
> 1. Create a new ruby attribute object
> 2. Initialize it
> 3. Call libxml xmlNewNsProp
> 4. It asks ruby for memory
> 5. Ruby runs a gc since it has no memory
> 6. The newly created, but not initialized attribute has its mark function called
>
> The reason for that change was it greatly reduced libxml-ruby's memory usage.
>
> > Making rxml_attr_mark() return immediately if xattr is null seems to have stopped it segving anyway - the mark routine for nodes is already doing that in fact.
>
> Yes, that would work. But I wonder if there are other cases of something similar happening.
>
> I think you're seeing it because you probably have a lot higher load on your server then what we test with. I tried to duplicate the issue but didn't succeed:
>
>     def test_high_allocations
>       node = XML::Node.new('test')
>       1.upto(10) do |i|
>         name = "attr_#{i}"
>         XML::Attr.new(node, name, i.to_s)
>       end
>       assert(true)
>     end
>
> Anyway, I added the check and went back to using libxml's internal memory allocator. Fixes in 1.1.0 which I just uploaded. Let me know how it goes.
>
> Charlie
Re: [libxml-devel] SEGV with libxml-ruby 1.0.0
> What a shame. I think we should investigate alternatives, but high memory usage is better than crashing. :'(

My thinking on this is to go back through all the mark and free functions and see which other ones could theoretically have the problem. Then, having done that review, give it another try.

> I should point out that we also have high load but we haven't experienced any segfaulting - so it may be configuration dependent, but not necessarily

Probably so - I ran a test script to create 10k attributes to force the GC to happen, but couldn't reproduce it. But the stack trace is pretty clear on what's happening. Could be the Ruby version, or the OS, or the configuration.

> I'll try and take a look at this when I get the chance to see if we can find a way to have our cake and eat it too.

Yeah, see above. I think it is probably doable.

Charlie
[libxml-devel] libxml-ruby 1.0.0 fails to install with libxml 2.6.27
There is some code in ext/libxml/ruby_xml_html_parser_context.c that defines a couple of functions for older versions of libxml that are included in newer versions.

The comment says the functions were added to libxml in 2.6.27 but the code is included for all versions <= 2.6.27 so if you happen to have exactly 2.6.27 on your system then you get a double definition.

Tom

--
Tom Hughes (t...@compton.nu)
http://www.compton.nu/
Re: [libxml-devel] libxml-ruby 1.0.0 fails to install with libxml 2.6.27
> There is some code in ext/libxml/ruby_xml_html_parser_context.c that defines a couple of functions for older versions of libxml that are included in newer versions.

Right, those were added for OS X compatibility, since it has version 2.6.16. They actually exist in older libxml versions, but are not exposed in the header files. Thus my ugly hack to just copy them over to ruby_xml_html_parser_context.c.

> The comment says the functions were added to libxml in 2.6.27 but the code is included for all versions <= 2.6.27 so if you happen to have exactly 2.6.27 on your system then you get a double definition.

Yup, should be < not <=. I take it you are using 2.6.27?

Fix checked into trunk.

Charlie
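
The off-by-one in the version check can be illustrated with a short C sketch. It assumes the guard compares against libxml2's integer version encoding (major * 10000 + minor * 100 + micro, so 2.6.27 is 20627); the function and macro names below are hypothetical, and the real code presumably uses a preprocessor conditional rather than runtime functions.

```c
#include <assert.h>
#include <stdbool.h>

/* 2.6.27 in libxml2's integer encoding: 2 * 10000 + 6 * 100 + 27. */
#define DEMO_FIRST_VERSION_WITH_FUNCS 20627

/* Reported behaviour: the copied functions are also compiled for 2.6.27
 * itself, where the library already exposes them (double definition). */
bool demo_needs_compat_buggy(int libxml_version)
{
    assert(libxml_version > 0);
    return libxml_version <= DEMO_FIRST_VERSION_WITH_FUNCS;
}

/* Corrected behaviour: only versions strictly older than 2.6.27 get
 * the copied functions. */
bool demo_needs_compat_fixed(int libxml_version)
{
    assert(libxml_version > 0);
    return libxml_version < DEMO_FIRST_VERSION_WITH_FUNCS;
}
```

For 20627 the buggy check returns true and the fixed check returns false, while 20616 (the OS X version mentioned above) still gets the compatibility code under both.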
Re: [libxml-devel] libxml-ruby 1.0.0 fails to install with libxml 2.6.27
Charlie Savage wrote:

>> The comment says the functions were added to libxml in 2.6.27 but the code is included for all versions <= 2.6.27 so if you happen to have exactly 2.6.27 on your system then you get a double definition.
>
> Yup, should be < not <=. I take it you are using 2.6.27?

Well one of our machines was, yes. That seems to be the latest version in Etch.

Tom

--
Tom Hughes (t...@compton.nu)
http://www.compton.nu/