[ 
https://issues.apache.org/jira/browse/S2GRAPH-205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16431985#comment-16431985
 ] 

Chul Kang commented on S2GRAPH-205:
-----------------------------------

We already have another class that initializes S2Graph only once.
{code:java}
object S2SinkContext {
  private var s2SinkContext:S2SinkContext = null

  def apply(config:Config):S2SinkContext = {
    if (s2SinkContext == null) {
      s2SinkContext = new S2SinkContext(config)
    }
    s2SinkContext
  }
}

{code}

Using this class, we can change the code from the issue description as follows:
{code:java}
df.foreachPartition { iters =>
  val config = ConfigFactory.parseString(serializedConfig)
  val s2Graph = S2SinkContext(config).getGraph
  ...
}
{code}

This way S2Graph is initialized only once on each executor, instead of once per task.
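
One caveat: the null check in S2SinkContext.apply is not synchronized, and an executor with more than one core runs several tasks as threads in the same JVM, so two tasks that start together could in principle both construct an instance. Below is a minimal sketch of a synchronized variant. ExpensiveGraph, GraphHolder and Demo are hypothetical stand-ins, not S2Graph classes, and the plain threads only simulate concurrent tasks.
{code:java}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for an expensive resource such as S2Graph.
final class ExpensiveGraph(val configText: String)

object GraphHolder {
  // Counts constructions so the once-per-JVM behaviour can be checked.
  val initCount = new AtomicInteger(0)

  private var instance: ExpensiveGraph = null

  // synchronized makes the null check and the assignment one atomic step,
  // so concurrent callers cannot both construct an instance.
  def apply(configText: String): ExpensiveGraph = synchronized {
    if (instance == null) {
      initCount.incrementAndGet()
      instance = new ExpensiveGraph(configText)
    }
    instance
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Eight threads stand in for eight tasks sharing one executor JVM.
    val threads = (1 to 8).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = { GraphHolder("key = value"); () }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    println(s"initializations: ${GraphHolder.initCount.get}")
  }
}
{code}
Running Demo prints "initializations: 1". Note that, as in S2SinkContext, the config passed by later callers is ignored once the instance exists.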

 

> too many initialize S2Graph when writeBatchMutate on S2GraphSink
> ----------------------------------------------------------------
>
>                 Key: S2GRAPH-205
>                 URL: https://issues.apache.org/jira/browse/S2GRAPH-205
>             Project: S2Graph
>          Issue Type: Sub-task
>          Components: s2jobs
>            Reporter: Chul Kang
>            Assignee: Chul Kang
>            Priority: Minor
>
> When the function S2GraphHelper.initS2Graph() is called, S2Graph is 
> initialized every time.
> This also initializes the Model class, so many connections to the DB can be 
> created.
> In particular, when you call writeBatchWithMutate on the S2GraphSink class, 
> the following code initializes S2Graph for each task.
> {code:java}
> df.foreachPartition { iters =>
>   val config = ConfigFactory.parseString(serializedConfig)
>   val s2Graph = S2GraphHelper.initS2Graph(config)
>   ...
> }
> {code}
>  
> I think it would be better if we could reuse the S2Graph instance on the 
> same executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)