Spark + CDB (Cockroach DB) support...

2018-06-15 Thread Muthu Jayakumar
Hello there, I am trying to find out whether CDB support is available for Apache Spark. I can currently use CDB through the Postgres driver, but I would like to know whether there are any specialized drivers I can use that optimize for predicate push-down and other optimizations pertaining to
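
A minimal sketch of the Postgres-driver route described in the question, assuming CockroachDB's Postgres-compatible wire protocol and its default port 26257. The helper names, host, database, table, and user are hypothetical placeholders, not values from the thread. Spark's generic JDBC source can push simple filters down to the database; whether a particular predicate was pushed can be checked in the `explain()` output.

```python
def cockroach_jdbc_options(host, database, table, user, port=26257):
    """Build options for Spark's generic JDBC source (hypothetical helper)."""
    return {
        "url": f"jdbc:postgresql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "driver": "org.postgresql.Driver",
    }


def read_filtered(spark, options, condition):
    """Read through JDBC and apply a filter that Spark may push down."""
    df = spark.read.format("jdbc").options(**options).load()
    filtered = df.filter(condition)
    filtered.explain()  # inspect the physical plan for pushed filters
    return filtered


# Placeholder values, not taken from the thread.
opts = cockroach_jdbc_options("localhost", "bank", "accounts", "root")
print(opts["url"])
```

With a running SparkSession and the Postgres JDBC jar on the classpath, `read_filtered(spark, opts, "balance > 100")` would return the filtered DataFrame; the column name here is likewise an assumption.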

Re: [structured-streaming][parquet] readStream files order in Parquet

2018-06-15 Thread Tathagata Das
The files are processed in order of each file's last-modified timestamp. The path and partitioning scheme are not used for ordering. On Thu, Jun 14, 2018 at 6:59 AM, karthikjay wrote: > My parquet files are first partitioned by environment and then by date > like: > > env=testing/ >
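
The ordering rule in the reply can be illustrated with a small standalone sketch. It does not use Spark and is not Spark's file-source implementation; it only models "sort by last-modified time, ignore the path". The `date=` directory names and the file names are assumptions made for illustration.

```python
import os
import tempfile


def order_by_mtime(paths):
    """Sort paths by last-modified time, ignoring directory layout."""
    return sorted(paths, key=os.path.getmtime)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout loosely mirroring the env/date partitioning.
    layout = [
        ("env=testing/date=2018-06-02/a.parquet", 3000),
        ("env=testing/date=2018-06-01/b.parquet", 1000),
        ("env=production/date=2018-06-03/c.parquet", 2000),
    ]
    paths = []
    for rel, mtime in layout:
        full = os.path.join(root, rel)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()
        os.utime(full, (mtime, mtime))  # set access and modification times
        paths.append(full)

    ordered = [os.path.relpath(p, root) for p in order_by_mtime(paths)]
    print(ordered)
```

The resulting order follows the modification times (b, c, a), not the partition values in the paths.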

Re: Issue upgrading to Spark 2.3.1 (Maintenance Release)

2018-06-15 Thread Vamshi Talla
Aakash, are you able to run your code in the pyspark shell with no issues? Best Regards, Vamshi T From: Hyukjin Kwon Sent: Friday, June 15, 2018 10:18 AM To: Marcelo Vanzin Cc: aakash.spark@gmail.com; user @spark Subject: Re: Issue upgrading to Spark 2.3.1

Re: Issue upgrading to Spark 2.3.1 (Maintenance Release)

2018-06-15 Thread Hyukjin Kwon
I use PyCharm. Mind if I ask you to elaborate on what you did, step by step? On Sat, Jun 16, 2018 at 12:11 AM, Marcelo Vanzin wrote: > I'm not familiar with PyCharm. But if you can run "pyspark" from the > command line and not hit this, then this might be an issue with > PyCharm or your environment - e.g.

Re: Issue upgrading to Spark 2.3.1 (Maintenance Release)

2018-06-15 Thread Marcelo Vanzin
I'm not familiar with PyCharm. But if you can run "pyspark" from the command line and not hit this, then this might be an issue with PyCharm or your environment - e.g. having an old version of the pyspark code around, or maybe PyCharm itself might need to be updated. On Thu, Jun 14, 2018 at 10:01
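
One way to check for a stale pyspark copy, as suggested above, is to print which pyspark the interpreter actually imports. This is a minimal sketch; `locate_pyspark` is a hypothetical helper name, and the idea is to run it once from the command line and once from the PyCharm interpreter and compare the results.

```python
import importlib.util


def locate_pyspark():
    """Return (version, install path) of the importable pyspark, or None."""
    if importlib.util.find_spec("pyspark") is None:
        return None
    import pyspark
    return pyspark.__version__, pyspark.__file__


print(locate_pyspark())
```

If the two environments report different versions or paths, the IDE is likely picking up an older installation.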

Understanding Event Timeline of Spark UI

2018-06-15 Thread Aakash Basu
Hi, I have a job running whose Event Timeline looks as follows. I am trying to understand the gaps between these single lines; they seem to run in parallel, but not immediately in sequence with the other stages. Is there any other insight to draw from this, and what is the cluster doing during these gaps? Thanks, Aakash.