[jira] [Commented] (SPARK-23891) Debian based Dockerfile

2018-04-14 Thread Sercan Karaoglu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438509#comment-16438509
 ] 

Sercan Karaoglu commented on SPARK-23891:
-

So to summarize, as a user I want two things. First, official Spark images 
with all kinds of tags. Second, I would like to customize those images so that 
I can add my own jars and have a separate class loader load them, so that they 
do not conflict with the existing Spark classpath. The existing classes may be 
shaded or not, but either way the app layer and the Spark layer should be 
isolated from each other.
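
One partial approximation of this that Spark already ships is its 
user-classpath-first setting. The sketch below, in Python purely for 
illustration, assembles a `spark-submit` argument list that sets 
`spark.driver.userClassPathFirst` and `spark.executor.userClassPathFirst`, 
which Spark documents as experimental. The function name, jar names, and class 
name are hypothetical. This gives user jars precedence when classes are loaded; 
it is not the full isolation described above.

```python
from typing import List


def build_submit_args(app_jar: str, main_class: str, extra_jars: List[str]) -> List[str]:
    """Assemble spark-submit arguments that prefer user jars over Spark's own.

    Hypothetical helper for illustration; only the two --conf keys are
    real Spark settings (both marked experimental in the Spark docs).
    """
    args = [
        "spark-submit",
        "--class", main_class,
        # Give user-added jars precedence over Spark's jars in the driver.
        "--conf", "spark.driver.userClassPathFirst=true",
        # Same precedence rule for classes loaded in executors.
        "--conf", "spark.executor.userClassPathFirst=true",
    ]
    if extra_jars:
        # --jars takes a comma-separated list of additional jars.
        args += ["--jars", ",".join(extra_jars)]
    args.append(app_jar)
    return args
```

Calling `build_submit_args("app.jar", "com.example.Main", ["a.jar", "b.jar"])` 
returns a list that can be passed to `subprocess.run`.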

> Debian based Dockerfile
> ---
>
> Key: SPARK-23891
> URL: https://issues.apache.org/jira/browse/SPARK-23891
> Project: Spark
>  Issue Type: New Feature
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Sercan Karaoglu
>Priority: Minor
> Attachments: Dockerfile
>
>
> The current Dockerfile inherits from Alpine Linux, which causes the 
> netty-tcnative SSL bindings to fail while loading. This happens when we use 
> the Google Cloud Platform Bigtable client on top of a Spark cluster. It would 
> be better to have another, Debian-based Dockerfile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23891) Debian based Dockerfile

2018-04-14 Thread Sercan Karaoglu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438500#comment-16438500
 ] 

Sercan Karaoglu commented on SPARK-23891:
-

I don't know if you guys want to do this, but here is what I would suggest. If 
you take a look at [https://hub.docker.com/r/library/openjdk/], they publish 
JDK images with all kinds of tags that determine the underlying platform. 
Because Spark is another layer on top of the JVM, there could be an option to 
choose the Spark version plus the JDK and distro version from Docker Hub as 
official images. I think this should not be that hard, since we have CI/CD 
tools today that can automate pretty much everything. If you look at Docker 
Hub, there are no officially supported Spark images there yet.
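
As a rough illustration of the tag matrix suggested here, the sketch below 
enumerates every combination of Spark version, JDK version, and base distro 
that a CI job could build. The tag layout `<spark>-jdk<jdk>-<distro>` and the 
function name are assumptions made for this example, not an existing Spark or 
Docker Hub convention.

```python
from itertools import product
from typing import Iterable, List


def image_tags(spark_versions: Iterable[str],
               jdk_versions: Iterable[str],
               distros: Iterable[str]) -> List[str]:
    """Return one tag per (Spark, JDK, distro) combination.

    The tag format is hypothetical and chosen only for illustration.
    """
    return [
        f"{spark}-jdk{jdk}-{distro}"
        for spark, jdk, distro in product(spark_versions, jdk_versions, distros)
    ]
```

For example, `image_tags(["2.3.0"], ["8"], ["alpine", "debian"])` yields one 
Alpine tag and one Debian tag for the same Spark and JDK version.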



[jira] [Commented] (SPARK-23891) Debian based Dockerfile

2018-04-14 Thread Sercan Karaoglu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438499#comment-16438499
 ] 

Sercan Karaoglu commented on SPARK-23891:
-

Sure! I've just attached it. As a reference, here is another workaround to get 
netty-tcnative running in Docker using Alpine images: 
[https://github.com/pires/netty-tcnative-alpine].



[jira] [Commented] (SPARK-23891) Debian based Dockerfile

2018-04-14 Thread Erik Erlandson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438410#comment-16438410
 ] 

Erik Erlandson commented on SPARK-23891:


[~SercanKaraoglu] thanks for the information! You are correct; Spark also has a 
Netty dependency. Can you attach your customized Dockerfile to this JIRA? That 
would be a very useful reference for our ongoing container image discussions.



[jira] [Commented] (SPARK-23891) Debian based Dockerfile

2018-04-13 Thread Sercan Karaoglu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436942#comment-16436942
 ] 

Sercan Karaoglu commented on SPARK-23891:
-

Debian- and CentOS-based Linux distros are the most popular ones. Even though 
people nowadays tend to use Alpine-based JDK images because of their size, most 
libraries, like Netty, still ship native bindings built for both CentOS and 
Debian. As far as I know, Spark also has a Netty dependency. If you want less 
GC pressure and more performance from the TCP layer, you can add the Netty 
Linux native bindings to the classpath through jars that you can find in most 
repos, such as Maven Central; Netty then automatically binds to those .so 
files.

My problem was specific to SSL communication with Google Bigtable; in this 
case Google's SDK depends on the netty-tcnative library. I had two ways to 
solve this problem. One was to rebuild tcnative for Alpine, exclude the .so 
that comes in through the existing jars from the classpath, and add my 
custom-built .so to the classpath. The other was to change the base image, 
which is way easier. I solved the problem by customizing the Dockerfile you 
provide as a reference, and I would like to report the issue here.
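
For context on why the base image matters: the prebuilt netty-tcnative Linux 
binaries are linked against glibc, while Alpine uses musl libc. A minimal 
sketch of how a startup script could detect an Alpine base image before 
relying on those bindings is shown below. It parses the text of 
`/etc/os-release`; the function names are hypothetical and the check is an 
assumption about how one might guard against this, not something Spark does.

```python
from typing import Dict


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines from os-release content into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_alpine(os_release_text: str) -> bool:
    """Return True when the ID field identifies Alpine Linux."""
    return parse_os_release(os_release_text).get("ID", "").lower() == "alpine"
```

In a container, the input would typically be the contents of 
`/etc/os-release`, for example `is_alpine(open("/etc/os-release").read())`.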



[jira] [Commented] (SPARK-23891) Debian based Dockerfile

2018-04-12 Thread Erik Erlandson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436196#comment-16436196
 ] 

Erik Erlandson commented on SPARK-23891:


I do think that these reports are very useful for collecting data on community 
use cases. Is this incompatibility something fundamental to Alpine that can 
only be fixed by moving to Debian, or is it possible to hack the Alpine build 
to fix it?

 



[jira] [Commented] (SPARK-23891) Debian based Dockerfile

2018-04-12 Thread Erik Erlandson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436185#comment-16436185
 ] 

Erik Erlandson commented on SPARK-23891:


The question of what OS base to use for "canonical" images or dockerfiles is an 
open one. The use of alpine was influenced by the relatively small image size 
that resulted. We could entertain arguments about why debian, centos, or some 
other OS, might be an advantage.

The current position of the Apache Spark project is that the dockerfiles 
shipped with the project are for reference, and as an aid to users building 
their own images for use with the kubernetes back-end.  IMO, the project should 
not get into the business of supporting _multiple_ dockerfiles at the present 
time. In the future, if/when the "container image api" stabilizes further, we 
might reconsider maintaining multiple dockerfiles.

I'm interested in whether others have a different point of view; my take 
currently is that if users would like to construct similar dockerfiles using an 
alternative base OS, it would be great to publish them as a GitHub project 
where interested community members could use them.



[jira] [Commented] (SPARK-23891) Debian based Dockerfile

2018-04-12 Thread Anirudh Ramanathan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436092#comment-16436092
 ] 

Anirudh Ramanathan commented on SPARK-23891:


[~eje] has done a lot of research on these images and dependencies. PTAL

> Debian based Dockerfile
> ---
>
> Key: SPARK-23891
> URL: https://issues.apache.org/jira/browse/SPARK-23891
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Sercan Karaoglu
>Priority: Major
>
> The current Dockerfile inherits from Alpine Linux, which causes the 
> netty-tcnative SSL bindings to fail while loading. This happens when we use 
> the Google Cloud Platform Bigtable client on top of a Spark cluster. It would 
> be better to have another, Debian-based Dockerfile.


