Restart streaming query spark 2.1 structured streaming

2017-08-15 Thread purna pradeep
Hi, I'm trying to restart a streaming query to refresh a cached data frame. Where and how should I restart the streaming query? val sparkSes = SparkSession .builder .config("spark.master", "local") .appName("StreamingCahcePoc") .getOrCreate() import

Re: Restart streaming query spark 2.1 structured streaming

2017-08-15 Thread purna pradeep
ion and > restart query > activeQuery.stop() > activeQuery = startQuery() >} > >activeQuery.awaitTermination(100) // wait for 100 ms. >// if there is any error it will throw exception and quit the loop >// otherwise it will keep checking the conditi
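The snippet above sketches stopping a running query and starting a fresh one inside a monitoring loop. A minimal pure-Python sketch of that restart-on-condition loop is below; `StubQuery`, `start_query`, and `should_refresh` are hypothetical stand-ins for Spark's `StreamingQuery` and the thread's restart condition, not Spark APIs.

```python
# Sketch of the restart-on-condition loop discussed in this thread.
# StubQuery stands in for Spark's StreamingQuery; start_query and
# should_refresh are hypothetical placeholders, not Spark APIs.
import time

class StubQuery:
    def stop(self):
        pass  # in Spark: stops the streaming query

    def await_termination(self, timeout_ms):
        # In Spark this blocks up to timeout_ms and raises on query error.
        time.sleep(timeout_ms / 1000.0)

def start_query():
    return StubQuery()

def should_refresh(iteration):
    # Hypothetical condition, e.g. "the lookup data changed".
    return iteration == 1

def run(max_iterations=3):
    restarts = 0
    query = start_query()
    for i in range(max_iterations):
        if should_refresh(i):
            query.stop()           # stop the current query ...
            query = start_query()  # ... and start a fresh one
            restarts += 1
        query.await_termination(10)  # poll every 10 ms
    return restarts

print(run())  # one refresh triggered at iteration 1
```

In the real Spark version, `awaitTermination(100)` doubles as the error check: it throws if the query failed, which exits the loop.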

Re: Restart streaming query spark 2.1 structured streaming

2017-08-15 Thread purna pradeep
: > See > https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#recovering-from-failures-with-checkpointing > > Though I think that this currently doesn't work with the console sink. > > On Tue, Aug 15, 2017 at 9:40 AM, purna pradeep <purn

Re: Restart streaming query spark 2.1 structured streaming

2017-08-15 Thread purna pradeep
rigger/batch after the asynchronous > unpersist+persist will probably take longer as it has to reload the data. > > > On Tue, Aug 15, 2017 at 2:29 PM, purna pradeep <purna2prad...@gmail.com> > wrote: > >> Thanks tathagata das actually I'm planning to something like this >&g

Re: Restart streaming query spark 2.1 structured streaming

2017-08-16 Thread purna pradeep
Also, is query.stop() a graceful stop operation? What happens to already-received data: will it be processed? On Tue, Aug 15, 2017 at 7:21 PM purna pradeep <purna2prad...@gmail.com> wrote: > Ok thanks > > Few more > > 1. when I looked into the documentation it

StreamingQueryListener spark structured Streaming

2017-08-09 Thread purna pradeep
I'm working on a structured streaming application wherein I'm reading from Kafka as a stream, and for each batch of streams I need to perform an S3 lookup against a file (which is nearly 200 GB) to fetch some attributes. So I'm using df.persist() (basically caching the lookup), but I need to refresh the dataframe as the
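The question above is about periodically refreshing a persisted lookup. A pure-Python TTL sketch of that refresh pattern is below; in Spark this corresponds to unpersist()-ing the old DataFrame, re-reading it, and persist()-ing again. All names here (`RefreshingLookup`, `load_lookup`) are hypothetical illustrations, not Spark APIs.

```python
# Pure-Python sketch of the "refresh the cached lookup periodically"
# pattern from this thread. load_lookup stands in for re-reading the
# S3 file; in Spark you would unpersist() the old DataFrame, re-read,
# and persist() again.
import time

class RefreshingLookup:
    def __init__(self, loader, ttl_seconds):
        self.loader = loader
        self.ttl = ttl_seconds
        self.cached = loader()          # initial load ("persist")
        self.loaded_at = time.monotonic()

    def get(self):
        if time.monotonic() - self.loaded_at > self.ttl:
            self.cached = self.loader()  # "unpersist + re-read + persist"
            self.loaded_at = time.monotonic()
        return self.cached

calls = []
def load_lookup():
    calls.append(1)
    return {"id-1": "attrs-1"}

lookup = RefreshingLookup(load_lookup, ttl_seconds=0.05)
lookup.get()        # served from cache
time.sleep(0.1)
lookup.get()        # TTL expired: reloaded
print(len(calls))   # 2 loads: initial + one refresh
```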

use WithColumn with external function in a java jar

2017-08-28 Thread purna pradeep
I have data in a DataFrame with the below columns. 1) File format is csv. 2) All the below column datatypes are String: employeeid, pexpense, cexpense. Now I need to create a new DataFrame which has a new column called `expense`, which is calculated based on the columns `pexpense` and `cexpense`. The tricky part is
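The thread's answer (visible in the truncated reply below) wraps the jar's method in a UDF over `withColumn`. A pure-Python row-wise sketch of the same derivation is below; `calculate_expense` and its logic are hypothetical stand-ins for the method inside the external Java jar.

```python
# Pure-Python sketch of deriving an `expense` column from `pexpense`
# and `cexpense` by calling an external function, mirroring what a
# Spark UDF applied via withColumn would do. calculate_expense is a
# hypothetical stand-in for the method inside the Java jar.
def calculate_expense(pexpense: float, cexpense: float) -> float:
    return pexpense + cexpense  # assumed logic, for illustration only

rows = [
    {"employeeid": "e1", "pexpense": "100.0", "cexpense": "50.0"},
    {"employeeid": "e2", "pexpense": "200.5", "cexpense": "10.0"},
]

# All source columns are strings (per the question), so cast first.
with_expense = [
    {**r, "expense": calculate_expense(float(r["pexpense"]),
                                       float(r["cexpense"]))}
    for r in rows
]
print(with_expense[0]["expense"])  # 150.0
```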

Re: use WithColumn with external function in a java jar

2017-08-29 Thread purna pradeep
va().calculateExpense(pexpense.toDouble, > cexpense.toDouble)) > > > > > > On Tue, Aug 29, 2017 at 6:53 AM, purna pradeep <purna2prad...@gmail.com> > wrote: > >> I have data in a DataFrame with below columns >> >> 1)Fileformat is csv >> 2)All below

Select entire row based on a logic applied on 2 columns across multiple rows

2017-08-29 Thread purna pradeep
-- Forwarded message - From: Mamillapalli, Purna Pradeep <purnapradeep.mamillapa...@capitalone.com> Date: Tue, Aug 29, 2017 at 8:08 PM Subject: Spark question To: purna pradeep <purna2prad...@gmail.com> Below is the input Dataframe(In real this is a very lar

Re: Select entire row based on a logic applied on 2 columns across multiple rows

2017-08-29 Thread purna pradeep
Please click on unnamed text/html link for better view On Tue, Aug 29, 2017 at 8:11 PM purna pradeep <purna2prad...@gmail.com> wrote: > > -- Forwarded message - > From: Mamillapalli, Purna Pradeep < > purnapradeep.mamillapa...@capitalone.com> > Date: T

Re: Select entire row based on a logic applied on 2 columns across multiple rows

2017-08-30 Thread purna pradeep
ested but maybe with @ayan sql > > spark.sql("select *, row_number(), last_value(income) over (partition by > id order by income_age_ts desc) r from t") > > > On Tue, Aug 29, 2017 at 11:30 PM, purna pradeep <purna2prad...@gmail.com> > wrote: > >> @ay

Re: Select entire row based on a logic applied on 2 columns across multiple rows

2017-08-29 Thread purna pradeep
3| ES|101| 19000| 4/20/17| 1| > | 4/20/12| DS|102| 13000| 5/9/17| 1| > +++---+--+-+---+ > > This should be better because it uses all in-built optimizations in Spark. > > Best > Ayan > > On Wed, Aug 30, 2017 at 11:06 AM, purna pradeep <
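This thread converges on a window function: rank rows per id by timestamp descending and keep rank 1. A pure-Python equivalent of that selection is below; the rows are made-up sample data loosely modeled on the thread's table, not the original dataset.

```python
# Pure-Python equivalent of the window-function approach in this
# thread: for each id, keep the row with the latest income_age_ts,
# i.e. what `row_number() over (partition by id order by
# income_age_ts desc)` plus a filter on r == 1 selects.
# The rows below are made-up sample data.
from itertools import groupby

rows = [
    {"id": 101, "income": 19000, "income_age_ts": "2017-04-20"},
    {"id": 101, "income": 18000, "income_age_ts": "2017-03-01"},
    {"id": 102, "income": 13000, "income_age_ts": "2017-05-09"},
]

# groupby requires the input sorted by the partition key.
rows.sort(key=lambda r: r["id"])
latest = {}
for key, group in groupby(rows, key=lambda r: r["id"]):
    # ISO-format date strings compare correctly as strings.
    latest[key] = max(group, key=lambda r: r["income_age_ts"])

print(latest[101]["income"])  # 19000
```

As the reply above notes, expressing this with Spark's built-in window functions (rather than a custom aggregation) lets Spark apply its own optimizations.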

Spark http: Not showing completed apps

2017-11-08 Thread purna pradeep
Hi, I'm using Spark standalone on AWS EC2, and I'm using the Spark REST API http://<master>:8080/json to get completed apps, but the JSON shows completed apps as an empty array even though the job ran successfully.
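The standalone master's web UI exposes its status as JSON at port 8080. A sketch of checking the completed-apps list in such a payload is below; the payload is hand-written, and the field names ("activeapps", "completedapps") follow my understanding of the standalone master's JSON page and should be verified against your Spark version.

```python
# Sketch of inspecting the standalone master's JSON status page
# (http://<master>:8080/json). The payload below is a hand-written
# sample; field names are assumptions to verify against your Spark
# version, not guaranteed API.
import json

sample = json.loads("""
{
  "url": "spark://master:7077",
  "activeapps": [],
  "completedapps": [
    {"id": "app-20171108-0001", "name": "my-job", "state": "FINISHED"}
  ]
}
""")

completed = sample.get("completedapps", [])
print(len(completed))  # 1 completed app in this sample
```

If this list comes back empty after a successful run, compare against the master web UI itself: an app submitted in a mode the master does not track (or a restarted master) will not appear here.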

Oozie with spark 2.3 in Kubernetes

2018-05-11 Thread purna pradeep
Hello, I would like to know if anyone has tried Oozie with Spark 2.3 actions on Kubernetes for scheduling Spark jobs. Thanks, Purna

Spark driver pod eviction Kubernetes

2018-05-22 Thread purna pradeep
Hi, What would be the recommended approach to wait for the Spark driver pod to complete the currently running job before it gets evicted to new nodes while maintenance on the current node is going on (kernel upgrade, hardware maintenance, etc.) using the drain command? I don’t think I can use

Spark driver pod garbage collection

2018-05-23 Thread purna pradeep
Hello, Currently I observe that dead pods are not getting garbage collected (aka spark driver pods which have completed execution). So pods could sit in the namespace for weeks, potentially. This makes listing, parsing, and reading pods slower, as well as leaving junk sitting on the cluster. I believe

Spark 2.3 driver pod stuck in Running state — Kubernetes

2018-06-08 Thread purna pradeep
Hello, When I run spark-submit on a k8s cluster I’m seeing the driver pod stuck in Running state, and when I pulled the driver pod logs I’m able to see the below log. I do understand that this warning might be because of lack of CPU/memory, but I expect the driver pod to be in “Pending” state rather than “Running”

Spark 2.3 error on Kubernetes

2018-05-29 Thread purna pradeep
Hello, I’m getting below error when I spark-submit a Spark 2.3 app on Kubernetes *v1.8.3* , some of the executor pods were killed with below error as soon as they come up Exception in thread "main" java.lang.reflect.UndeclaredThrowableException at

Re: Spark 2.3 error on Kubernetes

2018-05-29 Thread purna pradeep
t/crashloop due to > lack of resource. > > On Tue, May 29, 2018 at 3:18 PM, purna pradeep > wrote: > >> Hello, >> >> I’m getting below error when I spark-submit a Spark 2.3 app on >> Kubernetes *v1.8.3* , some of the executor pods were killed with below >

spark partitionBy with partitioned column in json output

2018-06-04 Thread purna pradeep
im reading below json in spark {"bucket": "B01", "actionType": "A1", "preaction": "NULL", "postaction": "NULL"} {"bucket": "B02", "actionType": "A2", "preaction": "NULL", "postaction": "NULL"} {"bucket": "B03", "actionType": "A3", "preaction": "NULL", "postaction": "NULL"} val
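The question concerns `partitionBy` on write: Spark encodes the partition column into the output directory name (e.g. `bucket=B01/`) and omits it from the written JSON records. The pure-Python sketch below illustrates the layout and a common workaround, keeping (duplicating) the column in the record itself before writing; the sample records are taken from the question.

```python
# Pure-Python sketch of the issue in this thread: partitioning JSON
# records by `bucket` while keeping the `bucket` field inside each
# record. Spark's partitionBy puts the column value into the
# directory name and drops it from the data files; a common
# workaround is to duplicate the column before writing.
import json
from collections import defaultdict

lines = [
    '{"bucket": "B01", "actionType": "A1", "preaction": "NULL", "postaction": "NULL"}',
    '{"bucket": "B02", "actionType": "A2", "preaction": "NULL", "postaction": "NULL"}',
]

partitions = defaultdict(list)
for line in lines:
    rec = json.loads(line)
    # Directory-style partition key, as partitionBy would lay it out.
    # The record itself still carries the `bucket` field.
    partitions[f"bucket={rec['bucket']}"].append(rec)

print(sorted(partitions))                     # ['bucket=B01', 'bucket=B02']
print(partitions["bucket=B01"][0]["bucket"])  # B01
```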

Re: Unsubscribe

2018-02-11 Thread purna pradeep
Unsubscribe

Unsubscribe

2018-02-26 Thread purna pradeep
- To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Executor lost for unknown reasons error Spark 2.3 on kubernetes

2018-07-30 Thread purna pradeep
Hello, I’m getting the below error in the spark driver pod logs, and executor pods are getting killed midway through while the job is running; even the driver pod terminated with the below intermittent error. This happens if I run multiple jobs in parallel. Not able to see executor logs as executor pods

Re: Executor lost for unknown reasons error Spark 2.3 on kubernetes

2018-07-31 Thread purna pradeep
va:858) at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138) at java.lang.Thread.run(Thread.java: On Tue, Jul 31, 2018 at 8:32 AM purna pradeep wrote: > > Hello, >> >> >> >> I’m getting below error in spa

Executor lost for unknown reasons error Spark 2.3 on kubernetes

2018-07-31 Thread purna pradeep
> Hello, > > > > I’m getting below error in spark driver pod logs and executor pods are > getting killed midway through while the job is running and even driver pod > Terminated with below intermittent error ,this happens if I run multiple > jobs in parallel. > > > > Not able to see executor logs

spark driver pod stuck in Waiting: PodInitializing state in Kubernetes

2018-08-15 Thread purna pradeep
I'm running a Spark 2.3 job on a kubernetes cluster kubectl version Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-09T21:51:06Z", GoVersion:"go1.9.4", Compiler:"gc",

Re: spark driver pod stuck in Waiting: PodInitializing state in Kubernetes

2018-08-16 Thread purna pradeep
Hello, im running Spark 2.3 job on kubernetes cluster > > kubectl version > > Client Version: version.Info{Major:"1", Minor:"9", > GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", > GitTreeState:"clean", BuildDate:"2018-02-09T21:51:06Z", > GoVersion:"go1.9.4",

Re: spark driver pod stuck in Waiting: PodInitializing state in Kubernetes

2018-08-17 Thread purna pradeep
Resurfacing the question to get more attention. Hello, > > im running Spark 2.3 job on kubernetes cluster >> >> kubectl version >> >> Client Version: version.Info{Major:"1", Minor:"9", >> GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", >> GitTreeState:"clean",

Spark 2.3 Kubernetes error

2018-07-05 Thread purna pradeep
Hello, When I’m trying to set below options to spark-submit command on k8s Master getting below error in spark-driver pod logs --conf spark.executor.extraJavaOptions=" -Dhttps.proxyHost=myhost -Dhttps.proxyPort=8099 -Dhttp.useproxy=true -Dhttps.protocols=TLSv1.2" \ --conf

Spark 2.3 Kubernetes error

2018-07-06 Thread purna pradeep
> Hello, > > > > When I’m trying to set below options to spark-submit command on k8s Master > getting below error in spark-driver pod logs > > > > --conf spark.executor.extraJavaOptions=" -Dhttps.proxyHost=myhost > -Dhttps.proxyPort=8099 -Dhttp.useproxy=true -Dhttps.protocols=TLSv1.2" \ > > --conf

handling Remote dependencies for spark-submit in spark 2.3 with kubernetes

2018-03-08 Thread purna pradeep
I'm trying to run spark-submit to a Kubernetes cluster with a Spark 2.3 Docker container image. The challenge I'm facing is that the application has a mainapplication.jar and other dependency files & jars which are located in a remote location like AWS S3, but as per the Spark 2.3 documentation there is something

Spark 2.3 submit on Kubernetes error

2018-03-11 Thread purna pradeep
Getting the below errors when I’m trying to run spark-submit on a k8s cluster. *Error 1*: This looks like a warning; it doesn’t interrupt the app running inside the executor pod, but it keeps on getting this warning *2018-03-09 11:15:21 WARN WatchConnectionManager:192 - Exec Failure* *

Re: Spark 2.3 submit on Kubernetes error

2018-03-12 Thread purna pradeep
ed in your cluster? This issue > https://github.com/apache-spark-on-k8s/spark/issues/558 might help. > > > On Sun, Mar 11, 2018 at 5:01 PM, purna pradeep <purna2prad...@gmail.com> > wrote: > >> Getting below errors when I’m trying to run spark-submit on k8 cluster >>

Re: Scala program to spark-submit on k8 cluster

2018-04-04 Thread purna pradeep
Yes, a “REST application that submits a Spark job to a k8s cluster by running spark-submit programmatically”, and I would also like to expose it as a Kubernetes service so that clients can access it like any other REST API. On Wed, Apr 4, 2018 at 12:25 PM Yinan Li wrote: > Hi Kittu, > >

Rest API for Spark2.3 submit on kubernetes(version 1.8.*) cluster

2018-03-20 Thread purna pradeep
I'm using a Kubernetes cluster on AWS to run Spark jobs, and I'm using Spark 2.3. Now I want to run spark-submit from an AWS Lambda function to the k8s master; I would like to know if there is any REST interface to run spark-submit on the k8s master.

Re: Rest API for Spark2.3 submit on kubernetes(version 1.8.*) cluster

2018-03-21 Thread purna pradeep
kApplication CRD objects and > automatically submits the applications to run on a Kubernetes cluster. > > Yinan > > On Tue, Mar 20, 2018 at 7:47 PM, purna pradeep <purna2prad...@gmail.com> > wrote: > >> Im using kubernetes cluster on AWS to run spark jobs ,im using spa

Re: [ANNOUNCE] Announcing Apache Spark 2.4.0

2018-11-09 Thread purna pradeep
Thanks, this is great news. Can you please let me know if dynamic resource allocation is available in Spark 2.4? I’m using Spark 2.3.2 on Kubernetes; do I still need to provide executor memory options as part of the spark-submit command, or will Spark manage the required executor memory based on the spark job

Spark 2.3.1: k8s driver pods stuck in Initializing state

2018-09-26 Thread purna pradeep
Hello, We're running spark 2.3.1 on kubernetes v1.11.0 and our driver pods from k8s are getting stuck in initializing state like so: NAME READY STATUS RESTARTS AGE my-pod-fd79926b819d3b34b05250e23347d0e7-driver 0/1 Init:0/1 0 18h And

Dynamic executor scaling spark/Kubernetes

2019-04-16 Thread purna pradeep
Hello, Is Kubernetes dynamic executor scaling for Spark available in the latest release of Spark? I mean scaling the executors based on the workload vs. preallocating a number of executors for a Spark job. Thanks, Purna

Executor not getting added SparkUI & Spark Eventlog in deploymode:cluster

2017-11-14 Thread Mamillapalli, Purna Pradeep
Hi all, I'm performing spark-submit using the Spark REST API POST operation on port 6066 with the below config > Launch Command: > "/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.141-1.b16.el7_3.x86_64/jre/bin/java" > "-cp" "/usr/local/spark/conf/:/usr/local/spark/jars/*" "-Xmx4096M" >

Spark 2.3 error on kubernetes

2018-05-29 Thread Mamillapalli, Purna Pradeep
Hello, I’m getting the below intermittent error when I spark-submit a Spark 2.3 app on Kubernetes v1.8.3; some of the executor pods were killed with the below error as soon as they come up. Exception in thread "main" java.lang.reflect.UndeclaredThrowableException at

Executor lost for unknown reasons error Spark 2.3 on kubernetes

2018-07-30 Thread Mamillapalli, Purna Pradeep
Hello, I’m getting the below error in the spark driver pod logs, and executor pods are getting killed midway through while the job is running; even the driver pod terminated with the below intermittent error. This happens if I run multiple jobs in parallel. Not able to see executor logs as executor pods

Spark 2.3 Kubernetes error

2018-07-05 Thread Mamillapalli, Purna Pradeep
Hello, When I’m trying to set below options to spark-submit command on k8s Master getting below error in spark-driver pod logs --conf spark.executor.extraJavaOptions=" -Dhttps.proxyHost=myhost -Dhttps.proxyPort=8099 -Dhttp.useproxy=true -Dhttps.protocols=TLSv1.2" \ --conf

Spark 2.3.1: k8s driver pods stuck in Initializing state

2018-09-26 Thread Purna Pradeep Mamillapalli
We're running spark 2.3.1 on kubernetes v1.11.0 and our driver pods from k8s are getting stuck in initializing state like so: NAME READY STATUS RESTARTS AGE my-pod-fd79926b819d3b34b05250e23347d0e7-driver 0/1 Init:0/1 0 18h And from *kubectl