Some more info
val lines = ssc.socketTextStream(host, port) // works
val lines = ssc.receiverStream(
  new NiFiReceiver(conf, StorageLevel.MEMORY_AND_DISK_SER_2)) // does not work
Margus (margusja) Roo
http://margus.roo.ee
skype: margusja
https://www.facebook.com/allan.tuuring
+372 51 48 780
On 15/09/2017:
Hi
I tested the spark.streaming.receiver.maxRate and
spark.streaming.backpressure.enabled settings using socketStream, and
they work.
But if I am using nifi-spark-receiver
(https://mvnrepository.com/artifact/org.apache.nifi/nifi-spark-receiver),
then it does not honor
spark.streaming.receiver.maxRate.
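For reference, the two properties discussed in this thread are documented Spark Streaming settings; a minimal sketch of enabling them in spark-defaults.conf (the maxRate value is an arbitrary example, not a recommendation):

```
spark.streaming.backpressure.enabled   true
# upper bound, in records per second, per receiver
spark.streaming.receiver.maxRate       1000
```

One possible explanation for the NiFi case, offered as an assumption rather than a confirmed diagnosis: the receiver rate limit is enforced on the single-record store() path, so a custom receiver that stores records in whole blocks at a time may bypass the rate limiter.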
Hi -
I wanted to understand: does Spark SQL have GRANT and REVOKE statements available?
Is anyone working on making that available?
Regards,
Arun
Well, DataFrames make it easier to work on only some columns of the data
and to store results in new columns, removing the need to zip it all back
together just to preserve order.
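The point above can be sketched in plain Scala, with case classes standing in for rows (the names below are made up for illustration); Spark's DataFrame.withColumn plays the role of the copy-into-a-new-field step:

```scala
// Plain-Scala analogy: when a derived value travels with its row, there is
// no separate collection whose order has to be kept in sync.
case class Row(id: Int, text: String)

val rows = Seq(Row(1, "foo"), Row(2, "bar"))

// RDD-style: compute the derived values separately, then zip back --
// this only works if the ordering of both collections is preserved.
val lengths    = rows.map(_.text.length)
val zippedBack = rows.zip(lengths)

// DataFrame-style: the result lives in a new "column" of the same row,
// so ordering never becomes a concern.
case class RowWithLen(id: Int, text: String, len: Int)
val withLen = rows.map(r => RowWithLen(r.id, r.text, r.text.length))
```

Both produce the same pairing of row and derived value, but only the zip version depends on order being preserved between two collections.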
On 2017-09-05 14:04 CEST, mehmet.su...@gmail.com wrote:
Hi Johan,
DataFrames are built on top of RDDs,
Hi,
I am using the SparkR randomForest function and running into a
java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE.
It looks like I am hitting
https://issues.apache.org/jira/browse/SPARK-1476. I set
spark.default.parallelism=1000 but am still facing the same issue.
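For reference, SPARK-1476 is the 2 GB limit on a single partition/block, so the usual workaround is raising the partition count until each partition stays under 2 GB. One hedged observation: spark.default.parallelism only affects RDD operations; for DataFrame-based pipelines such as SparkR's ML functions, spark.sql.shuffle.partitions is the setting that controls shuffle partitioning, which may be why raising the former had no effect. For example, in spark-defaults.conf (the values are illustrative, not recommendations):

```
spark.default.parallelism      2000
spark.sql.shuffle.partitions   2000
```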
Hey Daniel, not sure this will help, but... I had a similar need where I wanted
the content of a DataFrame to become a "cell" or a row in the parent DataFrame.
I grouped the child DataFrame by key, then collected it as a list in the parent
DataFrame after a join operation. As I said, not sure it matches your case.
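The approach described above, joining and then collecting the child rows into a list per parent row, can be sketched with plain Scala collections (the case classes and names are made up for illustration; in Spark this would typically be a join followed by groupBy(...).agg(collect_list(...))):

```scala
// Sketch: nest the matching child values as a list inside each parent row.
case class Parent(id: Int, name: String)
case class Child(parentId: Int, value: String)

val parents  = Seq(Parent(1, "a"), Parent(2, "b"))
val children = Seq(Child(1, "x"), Child(1, "y"), Child(2, "z"))

// "join" on the key, then group the child values into a list per parent
val childrenByParent = children.groupBy(_.parentId)
val nested = parents.map { p =>
  (p.id, p.name, childrenByParent.getOrElse(p.id, Seq.empty).map(_.value))
}
// nested: Seq((1, "a", Seq("x", "y")), (2, "b", Seq("z")))
```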
Hi Johan,
DataFrames are built on top of RDDs; I am not sure whether the ordering
issues are any different there. Maybe you could create simulated data that
is just large enough, plus an example series of transformations, to
experiment on.
Best,
-m
Mehmet Süzen, MSc, PhD
PRIVILEGED AND CONFIDENTIAL
Hi guys,
I'm having trouble implementing this scenario:
I have a column with a typical entry being: ['apple', 'orange', 'apple',
'pear', 'pear'].
I need to use a StringIndexer to transform this to: [0, 2, 0, 1, 1].
I'm attempting to do this, but because of the nested operation on another
RDD I get an error.
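For reference, StringIndexer's default label ordering ("frequencyDesc", with ties broken alphabetically in recent Spark versions) is what produces [0, 2, 0, 1, 1] here: apple and pear both occur twice, orange once. A plain-Scala sketch of that indexing logic (the helper name is made up; note that in Spark, StringIndexer operates on a plain string column, so an array column would first need to be exploded):

```scala
// Sketch of StringIndexer-style indexing: labels are numbered by descending
// frequency, with ties broken alphabetically.
def indexLabels(values: Seq[String]): Map[String, Double] =
  values
    .groupBy(identity)
    .map { case (label, occs) => (label, occs.size) }
    .toSeq
    .sortBy { case (label, count) => (-count, label) } // most frequent first
    .zipWithIndex
    .map { case ((label, _), idx) => label -> idx.toDouble }
    .toMap

val column  = Seq("apple", "orange", "apple", "pear", "pear")
val mapping = indexLabels(column)
val indexed = column.map(mapping) // Seq(0.0, 2.0, 0.0, 1.0, 1.0)
```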
Thanks all for your answers. After reading the provided links I am still
uncertain of the details of what I'd need to do to get my calculations right
with RDDs. However I discovered DataFrames and Pipelines on the "ML" side of
the libs and I think they'll be better suited to my needs.
Best,
Johan