Hi, I've been using StreamingContext.stop(true, true), trying to stop my application gracefully, which means it can promise all received data will be processed before the whole application terminates. It does work, but I also noticed that it leads to extra time spent just waiting on empty RDDs.
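For anyone following along, here is a minimal Scala sketch of that graceful-stop pattern; the app name, batch interval, and the pipeline itself are placeholders, not from this thread:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("GracefulStopDemo")
    val ssc  = new StreamingContext(conf, Seconds(10))

    // ... build the DStream pipeline here ...

    ssc.start()

    // stopSparkContext = true also tears down the underlying SparkContext;
    // stopGracefully = true waits until all received data has been processed,
    // which is exactly where the extra wait on (possibly empty) batches comes from.
    ssc.stop(stopSparkContext = true, stopGracefully = true)

Setting spark.streaming.stopGracefullyOnShutdown to true gives the same graceful behaviour from a JVM shutdown hook, which avoids having to call stop() from application code.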
Hi,

With regard to your statement below:

"...technology choices are agnostic to use cases according to you"

If I may say so, I do not think that was the message implied. What was said was that in addition to "best technology fit" there are other factors, "equally important", that need to be considered.
So, Mich and the rest: technology choices are agnostic to use cases according to you? This is interesting, really interesting. Perhaps I stand corrected.

Regards,
Gourav
On Sun, Oct 11, 2020 at 5:00 PM Mich Talebzadeh wrote:

> if we take Spark and its massive parallel processing and in-memory cache ...
If we take Spark and its massive parallel processing and in-memory cache away, then one can argue that anything can do the "ETL" job: just write some Java/Scala/SQL/Perl/Python to read data from one DB and write it to another, often using JDBC connections. However, we all concur that may not be good enough.
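To make that concrete, here is a minimal sketch of the kind of hand-rolled JDBC copy being described; the connection URLs, credentials, and table/column names are made-up placeholders, and a suitable JDBC driver is assumed to be on the classpath:

    import java.sql.DriverManager

    // Single-process, row-at-a-time copy between two databases over JDBC --
    // it does the "ETL" job, but without Spark's parallelism or in-memory cache.
    val src = DriverManager.getConnection("jdbc:postgresql://source-host/db", "user", "pass")
    val dst = DriverManager.getConnection("jdbc:postgresql://target-host/db", "user", "pass")
    try {
      val rs  = src.createStatement().executeQuery("SELECT id, name FROM source_table")
      val ins = dst.prepareStatement("INSERT INTO target_table (id, name) VALUES (?, ?)")
      while (rs.next()) {
        ins.setLong(1, rs.getLong("id"))
        ins.setString(2, rs.getString("name"))
        ins.addBatch()
      }
      ins.executeBatch()   // flush the batched inserts to the target DB
    } finally {
      src.close()
      dst.close()
    }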
But when you have a fairly large volume of data, that is where Spark comes into the party. And I assume the requirement to use Spark is already established in the original question, and the discussion is whether to use Python vs Scala/Java.
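For contrast, the same copy written against Spark's JDBC data source reads and writes in parallel across partitions. This is only a sketch, assuming a spark-shell session where "spark" (the SparkSession) is already in scope; the URLs, bounds, and option values are placeholders:

    // Partitioned parallel read: Spark issues one bounded query per partition.
    val df = spark.read.format("jdbc")
      .option("url", "jdbc:postgresql://source-host/db")
      .option("dbtable", "source_table")
      .option("user", "user").option("password", "pass")
      .option("partitionColumn", "id")   // numeric column to split the read on
      .option("lowerBound", "1")
      .option("upperBound", "10000000")
      .option("numPartitions", "8")
      .load()

    // Parallel write from the executors into the target database.
    df.write.format("jdbc")
      .option("url", "jdbc:postgresql://target-host/db")
      .option("dbtable", "target_table")
      .option("user", "user").option("password", "pass")
      .mode("append")
      .save()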
On Sun, 11 Oct 2020 at 10:51 pm, Sasha Kacanski wrote:

> If org has folks ...
Thanks Ayan.

I am not qualified to answer your first point. However, my experience with Spark with Scala or Spark with Python agrees with your assertion that use cases do not come into it. Most DEV/OPS work dealing with ETL is provided by service companies whose workforce is very familiar with Java ...