<https://stackoverflow.com/questions/40748687/python-api-rate-limiting-how-to-limit-api-calls-globally>)
and use this connection variable inside of the UDF function in each
partition, invoking time.sleep. This will definitely introduce issues where
many partitions are trying to invoke the API.

I found this medium-article
<https://medium.com/geekculture/how-to-execute-a-rest-api-call-on-apache-spark-the-right-way-in-python-4367f2740e78>
which discusses the issue you are facing, but does not discuss a solution
for the same. Do check the comments also.

Regards,
Varun

On Sat, Aug 26, 2023 at 10:32 AM Harry Jamison wrote:
I am using Python 3.7 and Spark 2.4.7.
I am not sure what the best way to do this is.
I have a dataframe with a URL in one of the columns, and I want to download the
contents of that URL and put it in a new column.
Can someone point me in the right direction on how to do this? I looked at the
UDFs
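
For what it is worth, the per-partition time.sleep idea from this thread could
be sketched roughly as below. This is only a sketch under assumptions, not a
tested solution: fetch_url, fetch_partition, and min_interval are hypothetical
names made up for illustration, the "url" column name is assumed, and the
throttle applies per partition only, so it does not solve the problem of many
partitions hitting the API at the same time.

```python
import time
import urllib.request
from typing import Callable, Iterable, Iterator, Tuple


def fetch_url(url: str, timeout: float = 10.0) -> str:
    """Download a URL and return its body as text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_partition(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_url,
    min_interval: float = 1.0,  # assumed spacing between calls, in seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[str, str]]:
    """Yield (url, content) pairs for one partition, waiting at least
    min_interval seconds between consecutive requests in that partition."""
    last_call = None
    for url in urls:
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        try:
            content = fetch(url)
        except Exception as exc:
            # Keep the row and record the failure instead of failing the task.
            content = "ERROR: {}".format(exc)
        yield url, content
```

With a dataframe df that has a "url" column, this could be applied with
something like
df.rdd.map(lambda r: r["url"]).mapPartitions(fetch_partition).toDF(["url", "content"]).
Since each partition throttles itself independently, the overall request rate
grows with the number of partitions running at once, so reducing the partition
count before the call (for example with coalesce) may be needed to stay under
an API limit.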
Thanks for the link Prashant.
Regards
Sachit
On Tue, 5 Jan 2021, 15:08 Prashant Sharma wrote:
A lot of developers may have already moved to 3.0.x. FYI, 3.1.0 is just
around the corner (hopefully in a few days) and has a lot of improvements to
Spark on K8s, including its transition from experimental to GA in
this release.
See: https://issues.apache.org/jira/browse/SPARK-33005
Hi Users,
Could you please tell me which Spark version you have used in production on
Kubernetes?
Which version is recommended for production, given that both the Streaming
and core APIs have to be used via PySpark?
Thanks!
Kind Regards,
Sachit Murarka