But it forces you to create your own SparkContext, which I’d rather not do.
Also it doesn’t seem to allow me to directly create a table from a DataFrame,
as follows:
TestHive.createDataFrame[MyType](rows).write.saveAsTable("a_table")
From: Xin Wu [mailto:xwu0...@gmail.com]
Sent: 13 January 2017
I need to filter out outliers from a dataframe on all columns. I can
manually list all columns like:
df.filter(x => math.abs(x.get(0).toString.toDouble - means(0)) <= 3 * stddevs(0))
  .filter(x => math.abs(x.get(1).toString.toDouble - means(1)) <= 3 * stddevs(1))
...
But I want to turn it into a
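One way to avoid listing every column by hand is to fold the single-column predicate over the column indices. A minimal sketch, assuming means and stddevs are precomputed per-column arrays (all names here are illustrative, not from the original post):

```scala
// Sketch: generalize the chained per-column filters. `withinThreeSigma`
// is the single-column predicate; `rowOk` applies it to every column.
// `means` and `stddevs` are assumed to be precomputed per column.
object OutlierFilter {
  // True when the value lies within 3 standard deviations of the mean.
  def withinThreeSigma(x: Double, mean: Double, stddev: Double): Boolean =
    math.abs(x - mean) <= 3 * stddev

  // A row survives only if every column is within bounds.
  def rowOk(row: Seq[Double], means: Seq[Double], stddevs: Seq[Double]): Boolean =
    row.indices.forall(i => withinThreeSigma(row(i), means(i), stddevs(i)))
}
```

With a DataFrame this becomes a single filter over each row instead of one .filter call per column, or equivalently a foldLeft over the column indices that chains the filters programmatically.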
In terms of the NullPointerException, I think it is a bug: the test data
directories might have been moved already, so it failed to load the test data
to create the test tables. You may create a JIRA for this.
On Fri, Jan 13, 2017 at 11:44 AM, Xin Wu wrote:
> If you are using
If you are using spark-shell, you already have the instance "sc" as an
initialized SparkContext. If you are writing your own application, you need to
create a SparkSession, which comes with the SparkContext, so you can
reference it as sparkSession.sparkContext.
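For reference, a minimal sketch of that wiring in a standalone application (the app name and master below are placeholders, not from the original message):

```scala
import org.apache.spark.sql.SparkSession

// Sketch of a standalone app: build a SparkSession and pull the
// SparkContext out of it (in spark-shell, `sc` is already provided).
object MyApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("my-app")   // placeholder name
      .master("local[*]")  // placeholder; normally set by spark-submit
      .getOrCreate()

    val sc = spark.sparkContext // the SparkContext that comes with the session
    println(sc.applicationId)

    spark.stop()
  }
}
```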
In terms of creating a table from
I used the following:
val testHive = new org.apache.spark.sql.hive.test.TestHiveContext(sc, false)
val hiveClient = testHive.sessionState.metadataHive
hiveClient.runSqlHive("....")
On Fri, Jan 13, 2017 at 6:40 AM, Nicolas Tallineau <
nicolas.tallin...@ubisoft.com> wrote:
> I get a
I’m looking for tips on how to debug a PythonException that’s very sparse
on details. The full exception is below, but the only interesting bits
appear to be the following lines:
org.apache.spark.api.python.PythonException:
...
py4j.protocol.Py4JError: An error occurred while calling
Structured Streaming has a foreach sink, where you can essentially do what
you want with your data. It's easy to create a Kafka producer and write the
data out to Kafka.
http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#using-foreach
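A minimal sketch of such a foreach sink that forwards rows to Kafka (the broker address, topic name, and the String row type are assumptions for illustration):

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.sql.ForeachWriter

// Sketch: a ForeachWriter that opens one Kafka producer per partition
// and sends every streaming row to a topic.
class KafkaForeachWriter extends ForeachWriter[String] {
  @transient private var producer: KafkaProducer[String, String] = _

  override def open(partitionId: Long, version: Long): Boolean = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092") // placeholder broker
    props.put("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    producer = new KafkaProducer[String, String](props)
    true
  }

  override def process(value: String): Unit =
    producer.send(new ProducerRecord("output-topic", value)) // placeholder topic

  override def close(errorOrNull: Throwable): Unit = producer.close()
}

// Usage: streamingDataset.writeStream.foreach(new KafkaForeachWriter).start()
```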
On Fri, Jan 13, 2017 at 8:28 AM,
How do you do this with Structured Streaming? I see no mention of writing
to Kafka.
On Fri, Jan 13, 2017 at 10:30 AM, Peyman Mohajerian
wrote:
> Yes, it is called Structured Streaming:
> https://docs.databricks.com/_static/notebooks/structured-streaming-kafka.html
>
Hello,
My Spark Streaming app that reads Kafka topics and prints the DStream works
fine on my laptop, but on an AWS cluster it produces no output and no errors.
Please help me debug.
I am using Spark 2.0.2 and kafka-0-10.
Thanks
The following is the output of the spark streaming app...
17/01/14
I get a NullPointerException as soon as I try to execute a TestHive.sql(...)
statement since migrating to Spark 2, because it's trying to load non-existing
"test tables". I couldn't find a way to set the loadTestTables variable to
false.
Caused by: sbt.ForkMain$ForkError:
Hi Team,
Sorry if this question was already asked in this forum.
Can we ingest data into an Apache Kafka topic from a Spark SQL DataFrame?
Here is my Code which Reads Parquet File :
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val df =
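On the question of publishing a DataFrame to Kafka: a hedged sketch, assuming Spark 2.2+ where the built-in "kafka" batch sink is available (the broker address, topic, and column names below are placeholders):

```scala
// Sketch: publish a DataFrame to a Kafka topic with the built-in batch sink.
// Requires Spark 2.2+ and the spark-sql-kafka-0-10 package; the broker
// address, topic, and column names are placeholders.
df.selectExpr("CAST(id AS STRING) AS key", "CAST(payload AS STRING) AS value")
  .write
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("topic", "my-topic")
  .save()
```

On Spark 2.0/2.1 (current at the time of this thread) the same effect can be had by iterating over the rows and sending each one with a plain KafkaProducer.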
Hi,
Thanks for your answer.
I have chosen "Spark" as the "job type". There isn't any option where we
can choose the version. How can I choose a different version?
Thanks,
Anahita
On Thu, Jan 12, 2017 at 6:39 PM, A Shaikh wrote:
> You may have tested this code on Spark
On 13 January 2017 at 13:55, Anahita Talebi wrote:
> Hi,
>
> Thanks for your answer.
>
> I have chosen "Spark" as the "job type". There isn't any option where we can
> choose the version. How can I choose a different version?
There's "Preemptible workers, bucket,
Hi, you can take a look at this project: it is a distributed HA Spark
cluster for an AWS environment using Docker. We put the Spark EC2
instances in an ELB and use this code snippet to get the instance
IPs:
https://github.com/zalando-incubator/spark-appliance/blob/master/utils.py#L49-L56
Yes, it is called Structured Streaming:
https://docs.databricks.com/_static/notebooks/structured-streaming-kafka.html
http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html
On Fri, Jan 13, 2017 at 3:32 AM, Senthil Kumar
wrote:
> Hi Team ,
>
>
Hello,
Thanks a lot Dinko.
Yes, now it is working perfectly.
Cheers,
Anahita
On Fri, Jan 13, 2017 at 2:19 PM, Dinko Srkoč wrote:
> On 13 January 2017 at 13:55, Anahita Talebi
> wrote:
> > Hi,
> >
> > Thanks for your answer.
> >
> > I have
There is no automated solution right now. You have to issue manual ALTER
TABLE commands, which work for adding top-level columns but get tricky if
you are adding a field in a deeply nested struct.
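For the top-level case, the manual command is a plain Hive DDL statement (the table and column names here are placeholders):

```sql
-- Add a top-level column; existing rows read the new column as NULL.
ALTER TABLE my_table ADD COLUMNS (new_col STRING);
```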
Hopefully, the issue will be fixed in 2.2 because work has started on