Hi,
Could anyone please share an example of how to use Spark Structured Streaming
with Kafka and write the data into Hive? The versions I have are:
Spark 2.1 on CDH 5.10
Kafka 0.9
Thanks,
Asmath
Sent from my iPhone
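A minimal sketch of one way to wire this up, with an illustrative topic name, broker address, and warehouse paths; note the Structured Streaming Kafka source (spark-sql-kafka-0-10) expects brokers on 0.10 or later, so the 0.9 brokers may need an upgrade first:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("kafka-to-hive")
  .enableHiveSupport()
  .getOrCreate()

// Read from Kafka; key/value arrive as binary, so cast them to strings.
val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092")  // placeholder broker
  .option("subscribe", "events")                      // placeholder topic
  .load()
  .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

// Spark 2.1 has no native Hive streaming sink; one workaround is to write
// files under the location of a Hive external table and query through Hive.
val query = stream.writeStream
  .format("parquet")
  .option("path", "/warehouse/events")                // external table location
  .option("checkpointLocation", "/checkpoints/events")
  .start()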
Does the Kinesis connector for Structured Streaming auto-scale receivers if the
cluster is using dynamic allocation and auto-scaling?
Hey everyone,
I have a use case where I will be processing data in Spark and then writing
it back to MS SQL Server.
Is it possible to use the bulk insert functionality and/or batch the writes
back to SQL?
I am using the DataFrame API to write the rows:
df.write.jdbc(...)
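A minimal sketch of batching the writes, with placeholder connection details; the writer's batchsize option controls how many rows are sent per JDBC batch (it defaults to 1000):

import java.util.Properties

val props = new Properties()
props.setProperty("user", "spark_user")   // placeholder credentials
props.setProperty("password", "secret")
props.setProperty("batchsize", "10000")   // rows per JDBC batch

df.write
  .mode("append")
  .jdbc("jdbc:sqlserver://dbhost:1433;databaseName=mydb", "dbo.events", props)

Note that the generic JDBC writer issues batched INSERT statements rather than using SQL Server's bulk copy API, so true bulk insert would need a SQL Server-specific connector.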
Thanks in advance
If you have the temp view name (table, for example), couldn't you do
something like this?
val dfWithColumn = spark.sql("select *, timeuuid() as new_column from table")
Thanks,
Subhash
On Thu, Feb 1, 2018 at 11:18 AM, kant kodali wrote:
> Hi,
>
> Are you talking about df.withColumn()?
I am setting up a Spark 2.2.1 cluster; however, when I bring up the master and workers (both on Spark 2.2.1) I get this error. I tried Spark 2.2.0 and got the same error. It works fine on Spark 2.0.2. Has anyone seen this before? Any idea what's wrong?
I found this, but it's in a different
Hi,
Are you talking about df.withColumn()? If so, that's not what I meant. I
meant creating a new column using raw SQL. In other words, say I don't have a
DataFrame; I only have the view name from df.createOrReplaceTempView("table"), so
I can do things like "select * from table", and in a similar fashion I
want to add a new column.
Hi TD:
Here is the updated code with explain and the full stack trace.
Please let me know what could be the issue and what to look for in the explain
output.
Updated code:
import scala.collection.immutable
import org.apache.spark.sql.functions._
import org.joda.time._
You can use the persist() or cache() operation on the DataFrame.
On Tue, Dec 26, 2017 at 4:02 PM Shu Li Zheng wrote:
> Hi all,
>
> I have a scenario like this:
>
> val df = dataframe.map().filter()
> // agg 1
> val query1 = df.sum.writeStream.start
> // agg 2
> val query2 =
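A minimal sketch of that idea for a batch DataFrame, with illustrative paths and column names; note that streaming queries each re-execute their own lineage, so caching applies to the batch case:

import org.apache.spark.sql.functions._

val df = spark.read.parquet("/data/events")   // illustrative source
  .filter(col("value") > 0)
  .cache()                                    // materialize once, reuse below

val total = df.agg(sum("value"))              // agg 1
val rows  = df.agg(count("*"))                // agg 2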
Sure, use withColumn()...
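A minimal sketch of the withColumn() route, with a placeholder UDF standing in for the timeuuid function mentioned below:

import org.apache.spark.sql.functions.udf

// Placeholder; swap in a real time-based UUID generator.
val timeuuid = udf(() => java.util.UUID.randomUUID().toString)
val dfWithUuid = df.withColumn("new_column", timeuuid())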
jg
> On Feb 1, 2018, at 05:50, kant kodali wrote:
>
> Hi All,
>
> Is there any way to create a new timeuuid column on an existing DataFrame
> using raw SQL? You can assume that there is a timeuuid UDF function if that
> helps.
>
> Thanks!
Hi All,
Is there any way to create a new timeuuid column on an existing DataFrame
using raw SQL? You can assume that there is a timeuuid UDF function if that
helps.
Thanks!