[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15158113#comment-15158113
 ] 

Gus Heck commented on SOLR-8349:
--------------------------------

Looked at the patch some today. Mostly it looks easy enough to adapt to my 
client's code, but I think there might be some holes WRT decoders decoding the 
same content more than once when multiple cores are loaded, and it seems we 
hold onto the ByteBuffer after decoding, which doubles memory usage. Will 
comment more and provide suggestions (+ patch) tomorrow. Since decoding our 
file takes significant time and pegs the CPUs, I really don't want that 
repeating itself for all 40 cores :).
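
To make the decode-once concern concrete, here is a minimal sketch of one way to guarantee a single decode per resource within a JVM. It is not taken from the patch: the class name SharedDecodedCache, the method getOrDecode, and the "dictionary.csv" key are hypothetical, and the sketch assumes the decoded object is safe to share between cores. It relies on ConcurrentHashMap.computeIfAbsent applying the mapping function at most once per key, and it keeps only the decoded result, so the raw ByteBuffer becomes garbage once decoding finishes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical JVM-wide cache of decoded resources; not part of the SOLR-8349 patch. */
final class SharedDecodedCache {
  // Holds only decoded objects, never the raw bytes they were built from.
  private final ConcurrentMap<String, Object> decoded = new ConcurrentHashMap<>();

  /**
   * Returns the decoded object for key, running rawLoader and decoder at most once
   * per key even when several cores ask for it concurrently.
   */
  @SuppressWarnings("unchecked")
  public <T> T getOrDecode(String key,
                           Supplier<ByteBuffer> rawLoader,
                           Function<ByteBuffer, T> decoder) {
    return (T) decoded.computeIfAbsent(key, k -> {
      // The buffer is a local: nothing references it after the decoder returns.
      ByteBuffer raw = rawLoader.get();
      return decoder.apply(raw);
    });
  }

  public static void main(String[] args) {
    SharedDecodedCache cache = new SharedDecodedCache();
    Supplier<ByteBuffer> loader =
        () -> ByteBuffer.wrap("a,b,c".getBytes(StandardCharsets.UTF_8));
    Function<ByteBuffer, String[]> decoder =
        buf -> StandardCharsets.UTF_8.decode(buf).toString().split(",");

    // Two "cores" asking for the same resource receive the same instance.
    String[] first = cache.getOrDecode("dictionary.csv", loader, decoder);
    String[] second = cache.getOrDecode("dictionary.csv", loader, decoder);
    System.out.println(first == second); // prints true
  }
}
```

One trade-off of this approach: callers requesting the same key block until the first decode completes, which is the point here, but a long decode can also delay unrelated updates that hash to the same bin of the map.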

Loading code in my decoder will look something like this:
{code}
      // Dedicated pool, so the decode does not compete for the common pool.
      ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
      try (Stream<String> lines = new BufferedReader(
          new InputStreamReader(inputStream, StandardCharsets.UTF_8)).lines()) {
        try {
          // Submitting the terminal operation from inside the pool is a common
          // way to keep the parallel stream on this pool's workers.
          pool.submit(() -> lines.parallel().forEach(this::processSimpleCsvRow)).get();
        } catch (InterruptedException | ExecutionException e) {
          throw new IOException(e);
        }
      } catch (IOException e) {
        throw new RuntimeException("Cannot load csv", e);
      } finally {
        pool.shutdownNow();
      }
{code}


> Allow sharing of large in memory data structures across cores
> -------------------------------------------------------------
>
>                 Key: SOLR-8349
>                 URL: https://issues.apache.org/jira/browse/SOLR-8349
>             Project: Solr
>          Issue Type: Improvement
>          Components: Server
>    Affects Versions: 5.3
>            Reporter: Gus Heck
>            Assignee: Noble Paul
>         Attachments: SOLR-8349.patch, SOLR-8349.patch, SOLR-8349.patch
>
>
> In some cases search components or analysis classes may utilize a large 
> dictionary or other in-memory structure. When multiple cores are loaded with 
> identical configurations utilizing this large in-memory structure, each core 
> holds its own copy in memory. This has been noted in the past and a specific 
> case reported in SOLR-3443. This patch provides a generalized capability, and 
> if accepted, this capability will then be used to fix SOLR-3443.



