[jira] [Commented] (SPARK-23891) Debian based Dockerfile
[ https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438509#comment-16438509 ]

Sercan Karaoglu commented on SPARK-23891:
-----------------------------------------

So to summarize, as a user I want two things. The first is official Spark images with all kinds of tags. The second is the ability to customize those images so that I can add my jars to them and have a separate class loader load them, so that I have no conflicts with the existing Spark classpath. Existing classes may be shaded or not, but either way the app layer and the Spark layer should be isolated from each other.

> Debian based Dockerfile
> ---
>
> Key: SPARK-23891
> URL: https://issues.apache.org/jira/browse/SPARK-23891
> Project: Spark
> Issue Type: New Feature
> Components: Kubernetes
> Affects Versions: 2.3.0
> Reporter: Sercan Karaoglu
> Priority: Minor
> Attachments: Dockerfile
>
> The current Dockerfile inherits from Alpine Linux, which causes the netty
> tcnative SSL bindings to fail while loading. This is the case when we use
> Google Cloud Platform's Bigtable client on top of a Spark cluster. It would
> be better to have another, Debian-based Dockerfile.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23891) Debian based Dockerfile
[ https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438500#comment-16438500 ]

Sercan Karaoglu commented on SPARK-23891:
-----------------------------------------

I don't know if you guys want to do this, but here is what I would suggest. If you take a look at [https://hub.docker.com/r/library/openjdk/], they publish the JDK with all kinds of tags that determine the underlying platform. Because Spark is another layer on top of the JVM, there could be an option to choose the Spark version plus the JDK and distro version from Docker Hub as official images. I think this should not be that hard, since we have cool CI/CD tools today that can automate pretty much everything. If you look at Docker Hub, there are no officially supported Spark images there yet.
[jira] [Commented] (SPARK-23891) Debian based Dockerfile
[ https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438499#comment-16438499 ]

Sercan Karaoglu commented on SPARK-23891:
-----------------------------------------

Sure! I've just attached it, and as a reference, this is another workaround to get netty-tcnative running in Docker using Alpine images: [https://github.com/pires/netty-tcnative-alpine].
[jira] [Commented] (SPARK-23891) Debian based Dockerfile
[ https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438410#comment-16438410 ]

Erik Erlandson commented on SPARK-23891:
----------------------------------------

[~SercanKaraoglu] thanks for the information! You are correct; Spark also has a netty dep. Can you attach your customized Dockerfile to this JIRA? That would be a very useful reference for our ongoing container image discussions.
[jira] [Commented] (SPARK-23891) Debian based Dockerfile
[ https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436942#comment-16436942 ]

Sercan Karaoglu commented on SPARK-23891:
-----------------------------------------

Debian- and CentOS-based Linux distros are the most popular ones. Even though people nowadays tend to use Alpine-based JDK images because of their size, most libraries, Netty included, still ship prebuilt native bindings for both CentOS and Debian.

As far as I know, Spark also has a Netty dependency. If you want less GC pressure and more performance from the TCP layer, you can add the Netty Linux native bindings to the classpath through jars that you can find in most repos, such as Maven Central; Netty then binds to those .so files automatically.

My problem was specific to SSL communication with Google Bigtable: Google's SDK depends on the netty-tcnative library. I had two ways to solve this problem. One was to rebuild tcnative for Alpine, exclude the .so that comes in through the existing jars from the classpath, and add my custom-built .so to the classpath instead. The other was to change the base image, which is way easier. I solved the problem by customizing the Dockerfile you provide as a reference, and I would like to report the issue here.
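For the first workaround described above (excluding the .so that arrives through existing jars), it helps to know which jars bundle native libraries at all. The following is only a minimal sketch, not something taken from the attached Dockerfile: it treats each jar as a zip archive and lists any entries ending in `.so`. The function name `find_bundled_native_libs` and the default directory `/opt/spark/jars` are assumptions made for illustration; point it at wherever the jars live in your image.

```python
import sys
import zipfile
from pathlib import Path


def find_bundled_native_libs(jar_dir):
    """Return {jar file name: [.so entries]} for jars directly under jar_dir.

    Hypothetical helper for illustration; jars are read as plain zip archives.
    """
    found = {}
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                libs = [name for name in zf.namelist() if name.endswith(".so")]
        except zipfile.BadZipFile:
            # Skip files that carry a .jar suffix but are not valid archives.
            continue
        if libs:
            found[jar.name] = libs
    return found


if __name__ == "__main__":
    # /opt/spark/jars is an assumed location, not confirmed by this thread.
    jar_dir = sys.argv[1] if len(sys.argv) > 1 else "/opt/spark/jars"
    for jar_name, libs in find_bundled_native_libs(jar_dir).items():
        for lib in libs:
            print(f"{jar_name}: {lib}")
```

The listing only shows where native libraries are packaged. It does not tell you whether a given .so will load on Alpine, which still has to be checked inside the container.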
[jira] [Commented] (SPARK-23891) Debian based Dockerfile
[ https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436196#comment-16436196 ]

Erik Erlandson commented on SPARK-23891:
----------------------------------------

I do think that these reports are very useful for collecting data on community use cases. Is this incompatibility something fundamental to Alpine that can only be fixed by moving to Debian, or is it possible to hack the Alpine build to fix it?
[jira] [Commented] (SPARK-23891) Debian based Dockerfile
[ https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436185#comment-16436185 ]

Erik Erlandson commented on SPARK-23891:
----------------------------------------

The question of what OS base to use for "canonical" images or Dockerfiles is an open one. The use of Alpine was influenced by the relatively small image size that resulted. We could entertain arguments about why Debian, CentOS, or some other OS might be an advantage.

The current position of the Apache Spark project is that the Dockerfiles shipped with the project are for reference, and serve as an aid to users building their own images for use with the Kubernetes back-end. IMO, the project should not get into the business of supporting _multiple_ Dockerfiles at the present time. In the future, if/when the "container image API" stabilizes further, we might reconsider maintaining multiple Dockerfiles.

I'm interested to hear whether others have a different point of view. My take currently is that if users would like to construct similar Dockerfiles using an alternative base OS, it would be great to publish that as a GitHub project where interested community members could use it.
[jira] [Commented] (SPARK-23891) Debian based Dockerfile
[ https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436092#comment-16436092 ]

Anirudh Ramanathan commented on SPARK-23891:
--------------------------------------------

[~eje] has done a lot of research on these images and dependencies. PTAL

> Debian based Dockerfile
> ---
>
> Key: SPARK-23891
> URL: https://issues.apache.org/jira/browse/SPARK-23891
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.3.0
> Reporter: Sercan Karaoglu
> Priority: Major