[ 
https://issues.apache.org/jira/browse/LUCY-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361308#comment-16361308
 ] 

Serkan Mulayim commented on LUCY-326:
-------------------------------------

Thanks [~nwellnhof], and my apologies for the delay in getting back to you. Can 
you help me understand the steps to update the library? I could not find a file 
with exactly that name, but I did find 
/modules/analysis/snowstem/source/libstemmer/libstemmer_utf8.c, which I believe 
is the one. Are you suggesting that I apply the patch to this file?

 

> C lib: Possible memory leak in SnowStemmer when provided schema for the 
> indexer is not DECREFFED
> ------------------------------------------------------------------------------------------------
>
>                 Key: LUCY-326
>                 URL: https://issues.apache.org/jira/browse/LUCY-326
>             Project: Lucy
>          Issue Type: Bug
>          Components: C bindings
>    Affects Versions: 0.6.1
>         Environment: linux
>            Reporter: Serkan Mulayim
>            Priority: Major
>
> In my C library I create a static global struct (containing some runtime 
> variables as well as a lucy_Schema pointer) when the program is loaded. There 
> is also a destroy function that cleans up the runtime data, including 
> DECREFing the schema. When I index some documents by providing this schema to 
> the indexer and call the destroy function before the program (using the lib) 
> exits, I do not see any memory leaks in the valgrind output. I only see 
> non-zero "still reachable" values, which are due to the 
> lucy_bootstrap_parcel function.
> On the other hand, if I do not call the destroy function before exit, I 
> would expect to see only an increase in the "still reachable" blocks in the 
> valgrind output, but I also see "possibly lost", as follows:
> ---------------------------------------------------------------------------------------------------
> ==16942== 70 bytes in 1 blocks are possibly lost in loss record 147 of 178
>  ==16942== at 0x4C29B78: realloc (vg_replace_malloc.c:785)
>  ==16942== by 0x4F86CC4: increase_size (utilities.c:332)
>  ==16942== by 0x4F87865: replace_s (utilities.c:360)
>  ==16942== by 0x4EF4195: SN_set_current (api.c:62)
>  ==16942== by 0x4F44644: sb_stemmer_stem (libstemmer_utf8.c:80)
>  ==16942== by 0x4F65723: LUCY_SnowStemmer_Transform_IMP (SnowballStemmer.c:80)
>  ==16942== by 0x4F4FA69: LUCY_Analyzer_Transform (Analyzer.h:197)
>  ==16942== by 0x4F4FA69: LUCY_PolyAnalyzer_Transform_Text_IMP (PolyAnalyzer.c:110)
>  ==16942== by 0x4F15368: LUCY_Analyzer_Transform_Text (Analyzer.h:204)
>  ==16942== by 0x4F15368: LUCY_Inverter_Add_Field_IMP (Inverter.c:181)
>  ==16942== by 0x4F14E91: LUCY_Inverter_Add_Field (Inverter.h:296)
>  ==16942== by 0x4F14E91: LUCY_Inverter_Invert_Doc_IMP (Inverter.c:109)
>  ==16942== by 0x4F63164: LUCY_Inverter_Invert_Doc (Inverter.h:275)
>  ==16942== by 0x4F63164: LUCY_SegWriter_Add_Doc_IMP (SegWriter.c:109)
>  ==16942== by 0x4F7E069: LUCY_Indexer_Add_Doc (Indexer.h:260)
>  ==16942== by 0x4F7F23F: index_messages_json (Search.c:432)
>  ==16942==
>  ==16942== LEAK SUMMARY:
>  ==16942== definitely lost: 0 bytes in 0 blocks
>  ==16942== indirectly lost: 0 bytes in 0 blocks
>  ==16942== possibly lost: 70 bytes in 1 blocks
>  ==16942== still reachable: 246,683 bytes in 5,077 blocks
>  ==16942== suppressed: 0 bytes in 0 blocks
> ---------------------------------------------------------------------------------------------------
> Similarly, for another program where I only search (no indexing), I see 
> similar behaviour. The valgrind output for that one is below:
> -----------------------------------------------------------------------------------------------------
> ==16949==
>  ==16949== HEAP SUMMARY:
>  ==16949== in use at exit: 229,312 bytes in 5,061 blocks
>  ==16949== total heap usage: 34,993 allocs, 29,932 frees, 1,791,083 bytes allocated
>  ==16949==
>  ==16949== 37 bytes in 1 blocks are possibly lost in loss record 96 of 177
>  ==16949== at 0x4C29B78: realloc (vg_replace_malloc.c:785)
>  ==16949== by 0x4F86CC4: increase_size (utilities.c:332)
>  ==16949== by 0x4F87865: replace_s (utilities.c:360)
>  ==16949== by 0x4EF4195: SN_set_current (api.c:62)
>  ==16949== by 0x4F44644: sb_stemmer_stem (libstemmer_utf8.c:80)
>  ==16949== by 0x4F65723: LUCY_SnowStemmer_Transform_IMP (SnowballStemmer.c:80)
>  ==16949== by 0x4F4FA69: LUCY_Analyzer_Transform (Analyzer.h:197)
>  ==16949== by 0x4F4FA69: LUCY_PolyAnalyzer_Transform_Text_IMP (PolyAnalyzer.c:110)
>  ==16949== by 0x4EF35F3: LUCY_Analyzer_Transform_Text (Analyzer.h:204)
>  ==16949== by 0x4EF35F3: LUCY_Analyzer_Split_IMP (Analyzer.c:48)
>  ==16949== by 0x4F5AAC8: LUCY_Analyzer_Split (Analyzer.h:211)
>  ==16949== by 0x4F5AAC8: LUCY_QParser_Expand_Leaf_IMP (QueryParser.c:916)
>  ==16949== by 0x4F59ECA: LUCY_QParser_Expand (QueryParser.h:298)
>  ==16949== by 0x4F59ECA: LUCY_QParser_Parse_IMP (QueryParser.c:207)
>  ==16949== by 0x4F7E358: LUCY_QParser_Parse (QueryParser.h:284)
>  ==16949== by 0x4F7F492: get_query (Search.c:483)
>  ==16949==
>  ==16949== LEAK SUMMARY:
>  ==16949== definitely lost: 0 bytes in 0 blocks
>  ==16949== indirectly lost: 0 bytes in 0 blocks
>  ==16949== possibly lost: 37 bytes in 1 blocks
>  ==16949== still reachable: 229,275 bytes in 5,060 blocks
>  ==16949== suppressed: 0 bytes in 0 blocks
>  ==16949== Reachable blocks (those to which a pointer was found) are not shown.
>  ==16949== To see them, rerun with: --leak-check=full --show-leak-kinds=all
> ----------------------------------------------------------------------------------------------------
> *If I remove the SnowStemmer from the Analyzers, this issue does not 
> happen (and only "still reachable" is non-zero).*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
