[ https://issues.apache.org/jira/browse/TINKERPOP-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190994#comment-15190994 ]
Marko A. Rodriguez commented on TINKERPOP-1217:
-----------------------------------------------

By chance do you know where in the Spark job this was happening for you? This really should be initialized (by Spark) and I suspect that we have a process that is not initializing the pool before use. Reviewing the code, I note that we have a {{reduceByKey}} that didn't have the initialization. Do you know if this is being spit out during the "message pass" phase?

> Repeated Logging of "The HadoopPools has not been initialized, using the default pool"
> ---------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1217
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1217
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: hadoop
>    Affects Versions: 3.1.1-incubating
>            Reporter: Russell Alexander Spitzer
>
> When running a Spark Job against a rather large database my spark log fills
> with the following log line repeatedly
> {code}WARN 2016-03-10 15:58:20,123 HadoopPools.java:55 -
> org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph: The HadoopPools
> has not been initialized, using the default pool{code}
> This amounted to about 5GB of logging per Spark Executor over the course of
> 90 minutes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)