I have written this simple code to try streaming aggregation in Spark 2.4.
Somehow the job keeps running but never returns any result. If I remove the
groupBy and the count aggregation, it returns three columns: JobType,
Timestamp, and TS.
I would really appreciate any help.
val edgeDF = spark
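The original code is truncated, so the following is only a minimal sketch of a streaming groupBy/count of the kind described above. The source (Kafka, topic name, bootstrap servers) and the column derivation are assumptions; the original columns JobType, Timestamp, and TS come from the question. Note that in "append" output mode a streaming aggregation without a watermark runs but never emits rows, which matches the reported symptom; "complete" or "update" mode is typically needed.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical source and names; only the groupBy/count pattern is the point.
val spark = SparkSession.builder.appName("StreamingAgg").getOrCreate()
import spark.implicits._

val edgeDF = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // assumption
  .option("subscribe", "edges")                        // assumption
  .load()

// A streaming groupBy/count needs "complete" or "update" output mode;
// in "append" mode without a watermark the query produces no output.
val counts = edgeDF
  .selectExpr("CAST(value AS STRING) AS JobType")
  .groupBy($"JobType")
  .count()

counts.writeStream
  .outputMode("complete")
  .format("console")
  .start()
  .awaitTermination()
```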
1 - I am not sure how I can do what you suggest for #1, because I use the
entries in the initial df to build the query, and then from it I get the
second df. Could you explain more?
2 - I also thought about doing what you suggest in #2, but if I am not
mistaken, if I use regular Scala data s
Hi,
The Spark Postgres JDBC reader is limited: it relies on basic SELECT
statements with fetchsize, and it crashes on large tables even when
multiple partitions are set up with lower/upper bounds.
I am about to write a new Postgres JDBC reader based on "COPY TO STDOUT".
It would stream the data and
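The email breaks off, but the COPY-based approach it proposes can be sketched with the pgjdbc driver's CopyManager API, which streams rows from the server without building a result set. The connection URL, credentials, and table name below are hypothetical; this is only an illustration of the technique, not the reader the author is writing.

```scala
import java.io.ByteArrayOutputStream
import java.sql.DriverManager
import org.postgresql.PGConnection

// Hypothetical connection details.
val conn = DriverManager.getConnection(
  "jdbc:postgresql://localhost/mydb", "user", "secret")

// CopyManager streams the table as CSV over the wire, avoiding the
// fetchsize-based SELECT path that struggles on large tables.
val copyApi = conn.unwrap(classOf[PGConnection]).getCopyAPI
val out = new ByteArrayOutputStream()
copyApi.copyOut("COPY my_table TO STDOUT WITH (FORMAT csv)", out)
conn.close()
```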
https://issues.apache.org/jira/browse/HIVE-13632
李斌松 wrote on Saturday, December 29, 2018 at 4:08 PM:
> Hive has fixed this problem, which is not fixed in
> hive-exec-1.2.1.spark2.jar
>