Hello Mans,
The streaming DataSource APIs are still evolving and are not public yet.
Hence there is no official documentation. In fact, there is a new
DataSourceV2 API (in Spark 2.3) that we are migrating towards. So at this
point of time, it's hard to make any concrete suggestion. You can take a
Can you share your code and a sample of your data? Without seeing it, I
can't give a definitive answer, but I can offer some hints. If you have a
column of strings, you should be able to create a new column cast
to Integer. This can be accomplished in two ways:
df.withColumn("newColumn", df.curre
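The truncated snippet above might look like the following. A minimal sketch, assuming a DataFrame `df` with a string column named "value" (both names are hypothetical placeholders):

```scala
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.IntegerType

// Option 1: cast through the Column API
val withInt1 = df.withColumn("newColumn", col("value").cast(IntegerType))

// Option 2: cast through a SQL expression
val withInt2 = df.selectExpr("*", "CAST(value AS INT) AS newColumn")
```

Strings that cannot be parsed as integers become null in the new column rather than failing the job.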
It seems hard to construct a DataType from its String literal
representation.
dataframe.dtypes() returns the column names and their corresponding types. For
example, say I have an integer column named "sum": calling dataframe.dtypes()
would return "sum" and "IntegerType", but this string representat
Hi:
I am trying to create a custom structured streaming source and would like to
know if there is any example or documentation on the steps involved.
I've looked at some of the methods available in the SparkSession, but these
are internal to the sql package:
private[sql] def internalCreateDataFrame
Hi,
I tried my question at stackoverflow.com (
https://stackoverflow.com/questions/48445145/spark-standalone-mode-application-runs-but-executor-is-killed-with-exitstatus),
but it is yet to be answered, so I thought I would try the user group.
I am new to Apache Spark and was trying to run the example Pi Calculat
Hi All,
I have the datatype "IntegerType" represented as a String, and now I want to
create a DataType object out of it. I couldn't find anything in the DataType
or DataTypes API about how to do that.
Thanks!
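One way to approach this, sketched below with a hypothetical helper: map the "XxxType" names that dtypes() returns back to the corresponding DataType singletons. (In newer Spark versions there is also `DataType.fromDDL`, which parses DDL-style names such as "int", and `DataType.fromJson`, which parses the JSON names such as `"integer"` — but neither accepts the class-style "IntegerType" string directly.)

```scala
import org.apache.spark.sql.types._

// Hypothetical helper: translate the class-style name from dtypes()
// into the corresponding DataType object.
def dataTypeFromName(name: String): DataType = name match {
  case "IntegerType" => IntegerType
  case "LongType"    => LongType
  case "DoubleType"  => DoubleType
  case "StringType"  => StringType
  case "BooleanType" => BooleanType
  // ... extend with the other atomic types you need
  case other => throw new IllegalArgumentException(s"Unsupported type name: $other")
}

val t = dataTypeFromName("IntegerType")   // IntegerType
```

This keeps the mapping explicit; parsing via the DDL/JSON forms is less code but ties you to Spark's internal name conventions.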
Hi,
Just out of curiosity, in what sort of programming or design paradigm
does this way of solving things fit? If you are trying functional
programming, do you think currying will help?
Regards,
Gourav Sengupta
On Thu, Jan 25, 2018 at 8:04 PM, Margusja wrote:
> Hi
>
> Maybe I a
Hi
Maybe I am overthinking this. I'd like to set a broadcast variable in object
A's method y and read it in object A's method x.
In example:
object A {
  def main(args: Array[String]) {
    y()
    x()
  }
  def x(): Unit = {
    val a = bcA.value
    ...
  }
  def y(): String = {
    val bcA = sc.b
Hi list,
I'm trying to make a custom build of Spark, but in the end the Web UI
has no images.
Some help, please.
Build from:
git checkout v2.2.1
./dev/make-distribution.sh --name custom-spark --pip --tgz -Psparkr
-Phadoop-2.7 -Dhadoop.version=2.7.3 -Phive -Phive-thriftserver -Pmesos
-Pyarn -
Hi Mutahir,
I will try to answer some of your questions.
Q1) Can we use MapReduce and Apache Spark in the same cluster?
Yes. I run a cluster with both MapReduce2 and Spark, and I use YARN as the
resource manager.
Q2) is it mandatory to use GPUs for apache spark?
No. My cluster has Spark and does n
Are you writing from an Amazon instance or from an on-premises install to S3?
How many partitions are you writing from? Maybe you can try to “play” with
repartitioning to see how it behaves?
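The repartitioning suggestion could look like this sketch (names and the partition count are hypothetical): fewer, larger files often behave better on S3 than many small ones.

```scala
// Coalesce 400M rows into a manageable number of output files;
// tune the count to your data volume and executor memory.
val out = df.repartition(64)
out.write.parquet("s3a://my-bucket/output/")
```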
> On Jan 23, 2018, at 17:09, Vasyl Harasymiv wrote:
>
> It is about 400 million rows. S3 automatically chu
Hi All
I'm trying to move from MapWithState to Structured Streaming v2.2.1, but I've
run into a problem.
To convert Kafka data with a binary (protobuf) value to SQL, I'm taking the
dataset from readStream and doing
Dataset s = dataset.selectExpr("timestamp", "CAST(key as string)",
"ETBi
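For context, a sketch of the surrounding pipeline, with placeholder broker/topic names; the protobuf decoding itself would typically live in a UDF over the binary `value` column (`MyProto` below is a hypothetical generated protobuf class, not from the original message):

```scala
import org.apache.spark.sql.functions.udf

val dataset = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host:9092")   // placeholder
  .option("subscribe", "events")                    // placeholder topic
  .load()

// Keys are UTF-8 strings, so CAST works; values stay binary for the UDF.
val s = dataset.selectExpr("timestamp", "CAST(key AS STRING)", "value")

// Hypothetical protobuf decoder applied per row:
// val decode = udf((bytes: Array[Byte]) => MyProto.parseFrom(bytes).getField)
// val decoded = s.withColumn("field", decode(s("value")))
```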