>From my experience, SQL is easy for people who already know SQL syntax. With
>the correct indexing, SQL is also fast. But within programs, a DataFrame is much
>faster and more convenient for loading large data structures from external sources.
From: "rajat kumar"
To: "user @spark"
Sent: Monday, 6 December
Hi Users,
Are there use cases where one of SQL, DataFrame, or Dataset should be
preferred? Is there a recommended approach, or does any of them offer an
advantage or performance gain over the others?
Thanks
Rajat
Hi Meikel,
Well, the short answer is that it is what it is, for one reason or another. If
someone else has managed to make it work, then I will no doubt be delighted to
hear it. Until then I prefer the built-in Docker image.
Also, by centralising this in the Docker image, it will be available if a
node fails a
Hi Mich,
Thanks for your response. Yes, the --py-files option works. I also tested it.
The question is why the --archives option doesn't.
From Jira I can see that it should be available since 3.1.0:
https://issues.apache.org/jira/browse/SPARK-33530
https://issues.apache.org/jira/browse/SPARK-33615
B