[ https://issues.apache.org/jira/browse/LUCY-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361308#comment-16361308 ]
Serkan Mulayim commented on LUCY-326:
-------------------------------------

Thanks [~nwellnhof], and my apologies for the delay in getting back to you. Can you help me understand the steps to update the library? I could not find a file with exactly that name, but I found /modules/analysis/snowstem/source/libstemmer/libstemmer_utf8.c, which I believe is the one. Do you suggest that I apply the patch to this file?

> C lib: Possible memory leak in SnowStemmer when provided schema for the
> indexer is not DECREFFED
> ------------------------------------------------------------------------------------------------
>
>                 Key: LUCY-326
>                 URL: https://issues.apache.org/jira/browse/LUCY-326
>             Project: Lucy
>          Issue Type: Bug
>          Components: C bindings
>    Affects Versions: 0.6.1
>         Environment: linux
>            Reporter: Serkan Mulayim
>            Priority: Major
>
> In my C library I create a static global struct (containing some runtime
> variables as well as a lucy_Schema pointer) when the program is loaded.
> There is also a destroy function that cleans up the runtime data and also
> DECREFs the schema. When I index some documents by providing this schema to
> the indexer and call the destroy function before the program (using the
> lib) exits, I do not see any memory leaks in the valgrind output. I only see
> non-zero "still reachable" values, due to the lucy_bootstrap_parcel
> function.
> On the other hand, if I do not call the destroy function before exit, I
> would expect to see only an increase in the "still reachable" blocks in the
> valgrind output, but I also see "possibly lost", as follows:
> ----------------------------------------------------------------------------------------------------
> ==16942== 70 bytes in 1 blocks are possibly lost in loss record 147 of 178
> ==16942==    at 0x4C29B78: realloc (vg_replace_malloc.c:785)
> ==16942==    by 0x4F86CC4: increase_size (utilities.c:332)
> ==16942==    by 0x4F87865: replace_s (utilities.c:360)
> ==16942==    by 0x4EF4195: SN_set_current (api.c:62)
> ==16942==    by 0x4F44644: sb_stemmer_stem (libstemmer_utf8.c:80)
> ==16942==    by 0x4F65723: LUCY_SnowStemmer_Transform_IMP (SnowballStemmer.c:80)
> ==16942==    by 0x4F4FA69: LUCY_Analyzer_Transform (Analyzer.h:197)
> ==16942==    by 0x4F4FA69: LUCY_PolyAnalyzer_Transform_Text_IMP (PolyAnalyzer.c:110)
> ==16942==    by 0x4F15368: LUCY_Analyzer_Transform_Text (Analyzer.h:204)
> ==16942==    by 0x4F15368: LUCY_Inverter_Add_Field_IMP (Inverter.c:181)
> ==16942==    by 0x4F14E91: LUCY_Inverter_Add_Field (Inverter.h:296)
> ==16942==    by 0x4F14E91: LUCY_Inverter_Invert_Doc_IMP (Inverter.c:109)
> ==16942==    by 0x4F63164: LUCY_Inverter_Invert_Doc (Inverter.h:275)
> ==16942==    by 0x4F63164: LUCY_SegWriter_Add_Doc_IMP (SegWriter.c:109)
> ==16942==    by 0x4F7E069: LUCY_Indexer_Add_Doc (Indexer.h:260)
> ==16942==    by 0x4F7F23F: index_messages_json (Search.c:432)
> ==16942==
> ==16942== LEAK SUMMARY:
> ==16942==    definitely lost: 0 bytes in 0 blocks
> ==16942==    indirectly lost: 0 bytes in 0 blocks
> ==16942==      possibly lost: 70 bytes in 1 blocks
> ==16942==    still reachable: 246,683 bytes in 5,077 blocks
> ==16942==         suppressed: 0 bytes in 0 blocks
> ----------------------------------------------------------------------------------------------------
> Similarly, in another program that only searches (no indexing), I see
> similar behaviour.
> Valgrind output is below for that one:
> ----------------------------------------------------------------------------------------------------
> ==16949==
> ==16949== HEAP SUMMARY:
> ==16949==     in use at exit: 229,312 bytes in 5,061 blocks
> ==16949==   total heap usage: 34,993 allocs, 29,932 frees, 1,791,083 bytes allocated
> ==16949==
> ==16949== 37 bytes in 1 blocks are possibly lost in loss record 96 of 177
> ==16949==    at 0x4C29B78: realloc (vg_replace_malloc.c:785)
> ==16949==    by 0x4F86CC4: increase_size (utilities.c:332)
> ==16949==    by 0x4F87865: replace_s (utilities.c:360)
> ==16949==    by 0x4EF4195: SN_set_current (api.c:62)
> ==16949==    by 0x4F44644: sb_stemmer_stem (libstemmer_utf8.c:80)
> ==16949==    by 0x4F65723: LUCY_SnowStemmer_Transform_IMP (SnowballStemmer.c:80)
> ==16949==    by 0x4F4FA69: LUCY_Analyzer_Transform (Analyzer.h:197)
> ==16949==    by 0x4F4FA69: LUCY_PolyAnalyzer_Transform_Text_IMP (PolyAnalyzer.c:110)
> ==16949==    by 0x4EF35F3: LUCY_Analyzer_Transform_Text (Analyzer.h:204)
> ==16949==    by 0x4EF35F3: LUCY_Analyzer_Split_IMP (Analyzer.c:48)
> ==16949==    by 0x4F5AAC8: LUCY_Analyzer_Split (Analyzer.h:211)
> ==16949==    by 0x4F5AAC8: LUCY_QParser_Expand_Leaf_IMP (QueryParser.c:916)
> ==16949==    by 0x4F59ECA: LUCY_QParser_Expand (QueryParser.h:298)
> ==16949==    by 0x4F59ECA: LUCY_QParser_Parse_IMP (QueryParser.c:207)
> ==16949==    by 0x4F7E358: LUCY_QParser_Parse (QueryParser.h:284)
> ==16949==    by 0x4F7F492: get_query (Search.c:483)
> ==16949==
> ==16949== LEAK SUMMARY:
> ==16949==    definitely lost: 0 bytes in 0 blocks
> ==16949==    indirectly lost: 0 bytes in 0 blocks
> ==16949==      possibly lost: 37 bytes in 1 blocks
> ==16949==    still reachable: 229,275 bytes in 5,060 blocks
> ==16949==         suppressed: 0 bytes in 0 blocks
> ==16949== Reachable blocks (those to which a pointer was found) are not shown.
> ==16949== To see them, rerun with: --leak-check=full --show-leak-kinds=all
> ----------------------------------------------------------------------------------------------------
> *If I remove the SnowStemmer from the analyzers, this issue does not occur
> (and I only see a non-zero "still reachable" value).*
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
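
For readers following the report, the lifecycle it describes (a static global struct created at load time, plus a destroy function that DECREFs the schema) can be sketched roughly as below. This is only an illustration under stated assumptions, not the reporter's code and not the Lucy API: `FakeSchema`, `fake_schema_new`, `fake_decref`, `g_runtime`, `runtime_init`, `runtime_destroy` and `runtime_live_objects` are all hypothetical names, and the refcounted stand-in type takes the place of a real `lucy_Schema` so that the example compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as a lucy_Schema. */
typedef struct {
    int refcount;
} FakeSchema;

/* Number of stand-in objects allocated and not yet freed. */
static int live_objects = 0;

static FakeSchema *fake_schema_new(void) {
    FakeSchema *schema = malloc(sizeof *schema);
    if (schema == NULL) {
        abort();
    }
    schema->refcount = 1;
    live_objects++;
    return schema;
}

/* Plays the role of DECREF: frees the object once the count reaches zero. */
static void fake_decref(FakeSchema *schema) {
    if (schema != NULL && --schema->refcount == 0) {
        free(schema);
        live_objects--;
    }
}

/* Static global runtime state, as described in the report. */
static struct {
    FakeSchema *schema;
    int         initialized;
} g_runtime;

/* Called once when the program (or library) is loaded. */
void runtime_init(void) {
    assert(!g_runtime.initialized);
    g_runtime.schema      = fake_schema_new();
    g_runtime.initialized = 1;
}

/* Releases the runtime data; safe to call more than once. */
void runtime_destroy(void) {
    if (!g_runtime.initialized) {
        return;
    }
    fake_decref(g_runtime.schema);
    g_runtime.schema      = NULL;
    g_runtime.initialized = 0;
}

/* Exposed so a caller can check that nothing is left allocated. */
int runtime_live_objects(void) {
    return live_objects;
}
```

A caller would invoke `runtime_init()` at startup and `runtime_destroy()` before exiting. Because `runtime_destroy` takes no arguments, returns nothing and tolerates repeated calls, it could also be registered with the standard `atexit` function so that it runs even when the caller forgets, which may be one way to keep the valgrind report independent of caller discipline.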