[jira] [Commented] (SPARK-12414) Remove closure serializer
[ https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054529#comment-16054529 ] Ritesh Tijoriwala commented on SPARK-12414:
---

I have a similar situation. I have several classes that I would like to instantiate and use in executors, e.g. DB connections, Elasticsearch clients, etc. I don't want to write instantiation code in Spark functions and use "statics". There was a neat trick suggested here - https://issues.apache.org/jira/browse/SPARK-650 - but it seems this will no longer work starting with 2.0.0, as a consequence of this ticket. Could anybody from the Spark community recommend how to do some initialization on each Spark executor before any task execution begins?

> Remove closure serializer
> -
>
> Key: SPARK-12414
> URL: https://issues.apache.org/jira/browse/SPARK-12414
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core
> Affects Versions: 1.0.0
> Reporter: Andrew Or
> Assignee: Sean Owen
> Fix For: 2.0.0
>
>
> There is a config `spark.closure.serializer` that accepts exactly one value:
> the java serializer. This is because there are currently bugs in the Kryo
> serializer that make it not a viable candidate. This was uncovered by an
> unsuccessful attempt to make it work: SPARK-7708.
> My high-level point is that the Java serializer has worked well for at least
> 6 Spark versions now, and it is an incredibly complicated task to get other
> serializers (not just Kryo) to work with Spark's closures. IMO the effort is
> not worth it and we should just remove this documentation and all the code
> associated with it.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
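[In the absence of a setup-hook API, a workaround commonly discussed for this kind of use case is to construct heavy clients lazily inside the task code, behind a per-process singleton, so that each executor process builds them once on first use and no client object is ever serialized with the closure. The sketch below shows the pattern in plain Python, with no Spark dependency; `DbClient` is a hypothetical stand-in for a real DB or Elasticsearch client, and `process_partition` is the function you would pass to something like `mapPartitions`.]

```python
# One instance per executor (worker) process; never shipped from the driver.
_client = None

class DbClient:
    """Hypothetical heavy resource, standing in for a DB/Elasticsearch client."""
    instances = 0  # track constructions to show init happens exactly once

    def __init__(self):
        DbClient.instances += 1

    def lookup(self, x):
        return x * 2

def get_client():
    # Lazily create the client on first call inside this process; later
    # tasks running in the same executor reuse it instead of reconnecting.
    global _client
    if _client is None:
        _client = DbClient()
    return _client

def process_partition(rows):
    client = get_client()  # runs on the executor, not the driver
    return [client.lookup(r) for r in rows]

# In real code this would be rdd.mapPartitions(process_partition); here we
# simulate two tasks executing in the same executor process.
out1 = process_partition([1, 2])
out2 = process_partition([3])
```

[Because only `get_client` (a plain function) is referenced from the task, the closure serializer never sees the client itself, which sidesteps the serialization issue regardless of which serializer Spark uses.]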
[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053745#comment-16053745 ] Ritesh Tijoriwala commented on SPARK-650:
-

[~Skamandros] - Any similar tricks for Spark 2.0.0? I see that the config option to set the closure serializer has been removed - https://issues.apache.org/jira/browse/SPARK-12414. Currently we do a "set of different things" to ensure our classes are loaded/instantiated before Spark starts executing its stages. It would be nice to consolidate this in one place/hook.

> Add a "setup hook" API for running initialization code on each executor
> ---
>
> Key: SPARK-650
> URL: https://issues.apache.org/jira/browse/SPARK-650
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Matei Zaharia
> Priority: Minor
>
> Would be useful to configure things like reporting libraries
[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969422#comment-15969422 ] Ritesh Tijoriwala commented on SPARK-650:
-

[~Skamandros] - I would also like to know about hooking 'JavaSerializer'. I have a similar use case where I need to initialize a set of objects/resources on each executor. I would also like to know if anybody has a way to hook into some "clean up" on each executor 1) when the executor shuts down and 2) when a batch finishes, before the next batch starts.

> Add a "setup hook" API for running initialization code on each executor
> ---
>
> Key: SPARK-650
> URL: https://issues.apache.org/jira/browse/SPARK-650
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Matei Zaharia
> Priority: Minor
>
> Would be useful to configure things like reporting libraries

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
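[Spark exposes no official per-executor teardown hook, so the cleanup side of this request is usually approximated with two patterns: scoping the resource to a single partition with try/finally (cleanup when each task/batch finishes), or registering a close on process exit for a long-lived singleton (cleanup at executor shutdown). The sketch below illustrates both in plain Python; `Connection` and the function names are illustrative, not Spark APIs, and `atexit` corresponds roughly to a JVM shutdown hook on the Scala/Java side.]

```python
import atexit

class Connection:
    """Hypothetical resource that must be released when no longer needed."""
    open_count = 0

    def __init__(self):
        Connection.open_count += 1
        self.closed = False

    def send(self, x):
        return x + 1

    def close(self):
        self.closed = True

# Strategy 1: per-partition scope. Open and close inside the task body, so
# the resource is released when each partition finishes - even on failure.
def process_partition(rows):
    conn = Connection()
    try:
        return [conn.send(r) for r in rows]
    finally:
        conn.close()  # runs whether the task succeeds or raises

# Strategy 2: process-lifetime scope. A lazy singleton whose close() is
# registered to run when the executor's worker process exits.
_shared = None

def get_shared():
    global _shared
    if _shared is None:
        _shared = Connection()
        atexit.register(_shared.close)  # best-effort cleanup at shutdown
    return _shared

result = process_partition([1, 2, 3])
```

[Neither strategy gives a true "between batches" hook; strategy 1 simply pays the open/close cost per partition, which is why connection pools are often combined with it.]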