Hello,
My Spark Streaming app that reads Kafka topics and prints the DStream works
fine on my laptop, but on the AWS cluster it produces no output and no errors.
Please help me debug.
I am using Spark 2.0.2 and kafka-0-10
Thanks
The following is the output of the Spark Streaming app...
17/01/14
I’m looking for tips on how to debug a PythonException that’s very sparse
on details. The full exception is below, but the only interesting bits
appear to be the following lines:
org.apache.spark.api.python.PythonException:
...
py4j.protocol.Py4JError: An error occurred while calling
None.org.apac
Structured Streaming has a foreach sink, where you can essentially do what
you want with your data. It's easy to create a Kafka producer and write the
data out to Kafka.
http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#using-foreach
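A minimal sketch of such a sink, assuming a broker at localhost:9092 and a
hypothetical output topic (the ForeachWriter API is the one described in the
guide linked above):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.sql.{ForeachWriter, Row}

// Sends each row of a streaming query to Kafka as a comma-joined string.
class KafkaSink(brokers: String, topic: String) extends ForeachWriter[Row] {
  var producer: KafkaProducer[String, String] = _

  override def open(partitionId: Long, version: Long): Boolean = {
    val props = new Properties()
    props.put("bootstrap.servers", brokers)
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    producer = new KafkaProducer[String, String](props)
    true
  }

  override def process(row: Row): Unit =
    producer.send(new ProducerRecord[String, String](topic, row.mkString(",")))

  override def close(errorOrNull: Throwable): Unit =
    producer.close()
}

// Usage: df.writeStream.foreach(new KafkaSink("localhost:9092", "out-topic")).start()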
On Fri, Jan 13, 2017 at 8:28 AM, K
I need to filter out outliers from a DataFrame on all columns. I can
manually list all columns, like:
df.filter(x => math.abs(x.get(0).toString.toDouble - means(0)) <= 3 * stddevs(0))
  .filter(x => math.abs(x.get(1).toString.toDouble - means(1)) <= 3 * stddevs(1))
  ...
But I want to turn it into a ge
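A hedged sketch of one way to make this generic, assuming means and stddevs
are Seq[Double] indexed in the same order as df.columns:

// Fold the 3-sigma filter over every column index instead of
// listing each column by hand (means/stddevs are precomputed).
val filtered = df.columns.indices.foldLeft(df) { (acc, i) =>
  acc.filter(x => math.abs(x.get(i).toString.toDouble - means(i)) <= 3 * stddevs(i))
}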
In terms of the nullPointerException, I think it is a bug, since the test
data directories might have been moved already, so it failed to load the
test data to create the test tables. You may create a JIRA for this.
On Fri, Jan 13, 2017 at 11:44 AM, Xin Wu wrote:
> If you are using spark-shell, you have
If you are using spark-shell, you already have the instance "sc" as the
SparkContext, initialized for you. If you are writing your own application,
you need to create a SparkSession, which comes with the SparkContext, so you
can reference it like sparkSession.sparkContext.
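For example (a minimal sketch; the app name is arbitrary):

import org.apache.spark.sql.SparkSession

// The session builder creates and owns the SparkContext.
val spark = SparkSession.builder()
  .appName("my-app")
  .getOrCreate()

val sc = spark.sparkContext  // reference the underlying SparkContext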
In terms of creating a table from Dat
But it forces you to create your own SparkContext, which I’d rather not do.
Also, it doesn't seem to allow me to directly create a table from a DataFrame,
as follows:
TestHive.createDataFrame[MyType](rows).write.saveAsTable("a_table")
From: Xin Wu [mailto:xwu0...@gmail.com]
Sent: January 13, 2017
I used the following:
val testHive = new org.apache.spark.sql.hive.test.TestHiveContext(sc, false)
val hiveClient = testHive.sessionState.metadataHive
hiveClient.runSqlHive("....")
On Fri, Jan 13, 2017 at 6:40 AM, Nicolas Tallineau <
nicolas.tallin...@ubisoft.com> wrote:
> I get a nullPointerE
How do you do this with Structured Streaming? I see no mention of writing
to Kafka.
On Fri, Jan 13, 2017 at 10:30 AM, Peyman Mohajerian
wrote:
> Yes, it is called Structured Streaming:
> https://docs.databricks.com/_static/notebooks/structured-streaming-kafka.html
> http://spark.apache.org/docs/
There is no automated solution right now. You have to issue manual ALTER
TABLE commands, which work for adding top-level columns but get tricky if
you are adding a field in a deeply nested struct.
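For a top-level column, such a manual command looks roughly like this (table
and column names here are hypothetical; this is standard Hive DDL):

ALTER TABLE events ADD COLUMNS (new_col STRING);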
Hopefully, the issue will be fixed in 2.2 because work has started on
https://issues.apache.org/ji
Hello,
Thanks a lot Dinko.
Yes, now it is working perfectly.
Cheers,
Anahita
On Fri, Jan 13, 2017 at 2:19 PM, Dinko Srkoč wrote:
> On 13 January 2017 at 13:55, Anahita Talebi
> wrote:
> > Hi,
> >
> > Thanks for your answer.
> >
> > I have chosen "Spark" in the "job type". There isn't any opt
Yes, it is called Structured Streaming:
https://docs.databricks.com/_static/notebooks/structured-streaming-kafka.html
http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html
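A minimal sketch of the Kafka source from those docs (requires the
spark-sql-kafka-0-10 package; broker address and topic name are placeholders):

val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "my-topic")
  .load()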
On Fri, Jan 13, 2017 at 3:32 AM, Senthil Kumar
wrote:
> Hi Team ,
>
> Sorry if this question
I get a nullPointerException as soon as I try to execute a TestHive.sql(...)
statement since migrating to Spark 2, because it's trying to load nonexistent
"test tables". I couldn't find a way to switch the loadTestTables variable
to false.
Caused by: sbt.ForkMain$ForkError: java.lang.NullPointe
On 13 January 2017 at 13:55, Anahita Talebi wrote:
> Hi,
>
> Thanks for your answer.
>
> I have chosen "Spark" in the "job type". There isn't any option where we
> can choose the version. How can I choose a different version?
There's "Preemptible workers, bucket, network, version,
initialization, &
Hi,
Thanks for your answer.
I have chosen "Spark" in the "job type". There isn't any option where we
can choose the version. How can I choose a different version?
Thanks,
Anahita
On Thu, Jan 12, 2017 at 6:39 PM, A Shaikh wrote:
> You may have tested this code on the Spark version on your local mac
Hi, you can take a look at this project; it is a distributed HA Spark
cluster for the AWS environment using Docker. We put the Spark EC2
instances in an ELB and use this code snippet to get the instance
IPs:
https://github.com/zalando-incubator/spark-appliance/blob/master/utils.py#L49-L56
Dockerfi
Hi Team ,
Sorry if this question was already asked in this forum.
Can we ingest data into an Apache Kafka topic from a Spark SQL DataFrame?
Here is my code, which reads a Parquet file:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val df = sqlContext.read.parquet("/temp/*.parquet
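One hedged way to then push such a DataFrame to Kafka in batch, serializing
rows as JSON (broker address and topic name are assumptions for illustration):

df.toJSON.foreachPartition { partition =>
  val props = new java.util.Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  val producer = new org.apache.kafka.clients.producer.KafkaProducer[String, String](props)
  // send each row's JSON representation as the record value
  partition.foreach { json =>
    producer.send(new org.apache.kafka.clients.producer.ProducerRecord[String, String]("parquet-out", json))
  }
  producer.close()
}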