Use Spark extension points to implement row-level security

2018-08-16 Thread Richard Siebeling
Hi, I'd like to implement some kind of row-level security and am thinking of adding additional filters to the logical plan, possibly using the Spark extension points. Would this be feasible, for example using injectResolutionRule? Thanks in advance, Richard
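A minimal sketch of one way such a rule could look, assuming a hypothetical tenant_id column and a hard-coded tenant value (both illustrative, not from the thread; a real implementation would derive the tenant from the user or session context):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.expressions.{AttributeReference, EqualTo, Literal}
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan}
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.datasources.LogicalRelation

case class RowLevelSecurityRule(spark: SparkSession) extends Rule[LogicalPlan] {
  // Hard-coded for the sketch; would normally come from the user context.
  private val tenant = Literal("current-tenant")

  // The analyzer re-runs rules until the plan stabilizes, so check whether
  // the security filter is already present before adding it again.
  private def alreadySecured(plan: LogicalPlan): Boolean =
    plan.collectFirst {
      case Filter(EqualTo(a: AttributeReference, t: Literal), _)
          if a.name == "tenant_id" && t == tenant => ()
    }.isDefined

  override def apply(plan: LogicalPlan): LogicalPlan =
    if (alreadySecured(plan)) plan
    else plan.transformUp {
      // Wrap each data source relation that exposes the tenant column.
      case rel: LogicalRelation if rel.output.exists(_.name == "tenant_id") =>
        val col = rel.output.find(_.name == "tenant_id").get
        Filter(EqualTo(col, tenant), rel)
    }
}

val spark = SparkSession.builder()
  .withExtensions(_.injectResolutionRule(RowLevelSecurityRule))
  .getOrCreate()

The idempotency guard matters because injected resolution rules are applied repeatedly within the analyzer batch; without it the filter would be stacked on every iteration.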

Re: java.lang.UnsupportedOperationException: No Encoder found for Set[String]

2018-08-16 Thread Manu Zhang
You may try applying this PR https://github.com/apache/spark/pull/18416. On Fri, Aug 17, 2018 at 9:13 AM Venkat Dabri wrote: > We are using spark 2.2.0. Is it possible to bring the > ExpressionEncoder from 2.3.0 and related classes into my code base and > use them? I see the changes in Expressi

Re: java.lang.UnsupportedOperationException: No Encoder found for Set[String]

2018-08-16 Thread Venkat Dabri
We are using Spark 2.2.0. Is it possible to bring the ExpressionEncoder from 2.3.0 and related classes into my code base and use them? I see the changes in ExpressionEncoder between 2.3.0 and 2.2.0 are small, but there might be many other classes underneath that have changed. On Thu, Aug 16

Re: Pass config file through spark-submit

2018-08-16 Thread yujhe.li
So can you read the file on the executor side? I think a file passed via --files my.app.conf is added to the classpath, and you can use it directly.
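As a sketch of that approach (the file name my.app.conf and the key my.app.db.host follow the config shown later in this thread; whether SparkFiles resolution or the classpath applies depends on deploy mode, so verify for your setup):

// Submitted with: spark-submit --files my.app.conf ... my-app.jar
import java.io.File
import com.typesafe.config.{Config, ConfigFactory}
import org.apache.spark.SparkFiles

// Inside the job, a file shipped with --files can be resolved by name.
val config: Config = ConfigFactory.parseFile(new File(SparkFiles.get("my.app.conf")))
val host = config.getString("my.app.db.host")

If the file really does land on the classpath, ConfigFactory.load("my.app") would also pick it up; going through SparkFiles.get avoids relying on that.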

Re: Unable to see completed application in Spark 2 history web UI

2018-08-16 Thread Manu Zhang
Hi Fawze, Sorry, but I'm not familiar with CM. Maybe you can look into the logs (or turn on DEBUG logging). On Thu, Aug 16, 2018 at 3:05 PM Fawze Abujaber wrote: > Hi Manu, > > I'm using cloudera manager with single user mode and every process is > running with cloudera-scm user, the cloudera-scm is

something happened to MemoryStream after spark 2.3

2018-08-16 Thread Koert Kuipers
Hi, we just started testing internally with Spark 2.4 snapshots, and it seems our streaming tests are broken. I believe it has to do with MemoryStream. Before, we were able to create a MemoryStream, add data to it, convert it to an unbounded streaming DataFrame, and use it repeatedly. By using it re
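For reference, a sketch of the pattern described, against the 2.3-era API (MemoryStream lives in an internal package and is primarily meant for testing, so its behavior can change between releases, as this thread suggests):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.streaming.MemoryStream

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._
implicit val sqlCtx = spark.sqlContext

// Create a memory-backed source, push data into it, and run it as a stream.
val stream = MemoryStream[Int]
stream.addData(1, 2, 3)

val query = stream.toDS().writeStream
  .format("memory")
  .queryName("test_out")
  .outputMode("append")
  .start()

query.processAllAvailable()
spark.table("test_out").show()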

Re: [K8S] Spark initContainer custom bootstrap support for Spark master

2018-08-16 Thread Li Gao
Thanks! We will likely use the second option to customize the bootstrap. On Thu, Aug 16, 2018 at 10:04 AM Yinan Li wrote: > Yes, the init-container has been removed in the master branch. The > init-container was used in 2.3.x only for downloading remote dependencies, > which is now handled by ru

Re: [K8S] Spark initContainer custom bootstrap support for Spark master

2018-08-16 Thread Yinan Li
Yes, the init-container has been removed in the master branch. The init-container was used in 2.3.x only for downloading remote dependencies, which is now handled by running spark-submit in the driver. If you need to run custom bootstrap scripts using an init-container, the best option would be to

Re: java.lang.IndexOutOfBoundsException: len is negative - when data size increases

2018-08-16 Thread Vadim Semenov
One of the spills becomes bigger than 2 GiB and can't be loaded fully (arrays in Java can't have more than 2^31 - 1 elements): > org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillReader.loadNext(UnsafeSorterSpillReader.java:76) You can try increasing the number of partitions, so sp
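Two common ways to get more, smaller partitions, shown with the 2.x API as a sketch (the thread itself is on 1.6, where the equivalents are sqlContext.setConf and rdd.repartition; the value 2000 is an arbitrary placeholder):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Smaller tasks produce smaller spill files, keeping each spill under the
// ~2 GiB limit that Java byte arrays impose on the spill reader.
spark.conf.set("spark.sql.shuffle.partitions", "2000") // default is 200

val df = spark.range(0, 1000000L).toDF("id")
val repartitioned = df.repartition(2000) // or repartition by a well-distributed key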

java.lang.IndexOutOfBoundsException: len is negative - when data size increases

2018-08-16 Thread Deepak Sharma
Hi All, I am running a Spark-based ETL job on Spark 1.6 and facing this weird issue. The same code with the same properties/configuration runs fine in another environment, e.g. PROD, but never completes in CAT. The only difference is the size of the data being processed, and that only by 1-2 GB. This is the sta

[Spark Streaming] [ML]: Exception handling for the transform method of Spark ML pipeline model

2018-08-16 Thread sudododo
Hi, I'm implementing a Spark Streaming + ML application. The data arrives in a Kafka topic in JSON format. The Spark Kafka connector reads the data from the Kafka topic as a DStream. After several preprocessing steps, the input DStream is transformed into a feature DStream, which is fed into the Spark ML
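One way to keep a malformed micro-batch from killing the job is to wrap the transform call per batch, e.g. (a sketch; model and featureDf are assumed to be a fitted PipelineModel and the preprocessed batch, not names from the thread):

import scala.util.{Failure, Success, Try}
import org.apache.spark.ml.PipelineModel
import org.apache.spark.sql.DataFrame

// Returns None for a batch whose transform fails, instead of propagating
// the exception and stopping the streaming application.
def safeTransform(model: PipelineModel, featureDf: DataFrame): Option[DataFrame] =
  Try(model.transform(featureDf)) match {
    case Success(scored) => Some(scored)
    case Failure(e) =>
      println(s"Skipping batch, transform failed: ${e.getMessage}")
      None
  }

Note that transform is lazy: exceptions from bad rows often surface only when an action runs, so the Try may need to wrap the downstream write as well.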

Pass config file through spark-submit

2018-08-16 Thread James Starks
I have a config file that uses the Typesafe Config library, located on the local file system, and I want to submit that file through spark-submit so that the Spark program can read custom parameters. For instance: my.app { db { host = domain.cc port = 1234 db = dbname user = myus

Re: java.lang.UnsupportedOperationException: No Encoder found for Set[String]

2018-08-16 Thread Manu Zhang
Hi, It was added in Spark 2.3.0: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SQLImplicits.scala#L180 Regards, Manu Zhang On Thu, Aug 16, 2018 at 9:59 AM V0lleyBallJunki3 wrote: > Hello, > I am using Spark 2.2.2 with Scala 2.11.8. I wrote a short
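For those stuck on 2.2.x, a common workaround is a Kryo-based encoder rather than backporting classes (a sketch; the set is then stored as an opaque binary column):

import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Spark 2.2.x has no built-in encoder for Set[String]; Kryo serialization
// fills the gap, at the cost of losing a queryable column structure.
implicit val setEncoder: Encoder[Set[String]] = Encoders.kryo[Set[String]]

val ds = spark.createDataset(Seq(Set("a", "b"), Set("c")))
println(ds.collect().toList) // sets round-trip correctly

Converting each Set to a Seq before creating the Dataset is another option that keeps the data queryable.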

Re: spark driver pod stuck in Waiting: PodInitializing state in Kubernetes

2018-08-16 Thread purna pradeep
Hello, I'm running a Spark 2.3 job on a Kubernetes cluster: > > kubectl version > > Client Version: version.Info{Major:"1", Minor:"9", > GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", > GitTreeState:"clean", BuildDate:"2018-02-09T21:51:06Z", > GoVersion:"go1.9.4", Compile

Re: Structured streaming: Tried to fetch $offset but the returned record offset was ${record.offset}"

2018-08-16 Thread andreas . weise
On 2018/04/17 22:34:25, Cody Koeninger wrote: > Is this possibly related to the recent post on > https://issues.apache.org/jira/browse/SPARK-18057 ? > > On Mon, Apr 16, 2018 at 11:57 AM, ARAVIND SETHURATHNAM < > asethurath...@homeaway.com.invalid> wrote: > > > Hi, > > > > We have several str

Re: Unable to see completed application in Spark 2 history web UI

2018-08-16 Thread Fawze Abujaber
Hi Manu, I'm using Cloudera Manager in single user mode, and every process runs as the cloudera-scm user; cloudera-scm is a superuser, which is why I was confused about how it worked in Spark 1.6 but not in Spark 2.3. On Thu, Aug 16, 2018 at 5:34 AM Manu Zhang wrote: > If you are able to