I agree that for an OSS project, all endpoints that can be called are already
publicly known.
https://security.stackexchange.com/questions/138567/why-should-the-options-method-not-be-allowed-on-an-http-server
has a couple of good reasons, though.
"An essential part of security is to reduce the attack
I'm using spark-on-kubernetes to submit a Spark app to Kubernetes.
Most of the time it runs smoothly,
but sometimes I see in the logs after submitting that the driver pod phase
changed from Running to Pending, and another container starts in the pod even
though the first container exited successfully.
The driver l
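One thing worth checking (a guess, not a confirmed diagnosis): the driver pod's restartPolicy. With Always or OnFailure, the kubelet may start a new container in the same pod even after a clean exit; Never avoids that. A minimal illustrative pod spec (name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: spark-driver            # illustrative name
spec:
  restartPolicy: Never          # kubelet will not start a replacement container
  containers:
    - name: spark-kubernetes-driver
      image: my-spark:2.3       # placeholder image
```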
If "This method exposes which methods are supported by the endpoint" is
correct, I really don't understand how that is a security vulnerability,
considering the OSS nature of this project. Are you adding new endpoints to
this webserver?
More info about info/other methods:
https://
Please share the links if they are publicly available. Otherwise, please share
the names of the talks. Thank you
From: Jules Damji
Sent: Monday, April 29, 2019 8:04 PM
To: Michael Mansour
Cc: rajat kumar ; user@spark.apache.org
Subject: Re: [EXT] handling skewness issues
Yes, indeed! A f
+ d...@spark.apache.org
On Tue, Apr 30, 2019 at 4:23 PM Ankit Jain wrote:
> Aah - actually found https://issues.apache.org/jira/browse/SPARK-18664 -
> "Don't respond to HTTP OPTIONS in HTTP-based UIs"
>
> Does anyone know if this can be prioritized?
>
> Thanks
> Ankit
>
> On Tue, Apr 30, 2019 at
Aah - actually found https://issues.apache.org/jira/browse/SPARK-18664 -
"Don't respond to HTTP OPTIONS in HTTP-based UIs"
Does anyone know if this can be prioritized?
Thanks
Ankit
On Tue, Apr 30, 2019 at 1:31 PM Ankit Jain wrote:
> Hi Fellow Spark users,
> We are using Spark 2.3.0 and securit
I recommend using Structured Streaming, as it has a patch that can work
around this issue: https://issues.apache.org/jira/browse/SPARK-26267
Best Regards,
Ryan
On Tue, Apr 30, 2019 at 3:34 PM Shixiong(Ryan) Zhu
wrote:
> There is a known issue that Kafka may return a wrong offset even if the
There is a known issue that Kafka may return a wrong offset even if there
is no reset happening: https://issues.apache.org/jira/browse/KAFKA-7703
Best Regards,
Ryan
On Tue, Apr 30, 2019 at 10:41 AM Austin Weaver wrote:
> @deng - There was a short erroneous period where 2 streams were reading
>
Hello. I am using Zeppelin on an Amazon EMR cluster while developing Apache
Spark programs in Scala. The problem is that once the cluster is destroyed,
I lose all the notebooks on it. So over a period of time I have a lot of
notebooks that need to be manually exported to my local disk, and from
t
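A common way around this (hedged: these are standard Zeppelin settings rather than EMR-specific advice, and the bucket name is a placeholder) is to point Zeppelin's notebook storage at S3, so notebooks survive cluster teardown, e.g. in zeppelin-env.sh:

```shell
# Store notebooks in S3 instead of the cluster's local disk.
export ZEPPELIN_NOTEBOOK_S3_BUCKET=my-notebook-bucket   # placeholder bucket
export ZEPPELIN_NOTEBOOK_S3_USER=zeppelin               # key prefix: user/zeppelin/notebook
export ZEPPELIN_NOTEBOOK_STORAGE=org.apache.zeppelin.notebook.repo.S3NotebookRepo
```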
Hi Fellow Spark users,
We are using Spark 2.3.0, and the security team is reporting a violation:
Spark allows the HTTP OPTIONS method to work (this method exposes which
methods are supported by the endpoint, which could be exploited by an
attacker).
This method is handled by the Jetty web server; I see Spark uses J
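To make the concern concrete, here is a minimal sketch (plain Python's http.server, not Spark or Jetty code) of the behaviour SPARK-18664 asks for: refuse OPTIONS probes instead of enumerating allowed methods, while ordinary requests keep working.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class NoOptionsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Normal traffic is served as usual.
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def do_OPTIONS(self):
        # Reject method discovery instead of listing allowed methods.
        self.send_response(405)
        self.end_headers()

    def log_message(self, *args):
        # Silence per-request logging for the demo.
        pass

server = HTTPServer(("127.0.0.1", 0), NoOptionsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

def probe(method):
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    conn.request(method, "/")
    status = conn.getresponse().status
    conn.close()
    return status

options_status = probe("OPTIONS")  # method discovery is refused
get_status = probe("GET")          # ordinary requests still succeed
server.shutdown()
```

(Jetty's default behaviour differs; this is only an illustration of the requested fix, not Spark's actual server code.)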
@deng - There was a short erroneous period where two streams reading from
the same topic with the same group id were running at the same time. We saw
errors in this and stopped the extra stream. That being said, I would think
that regardless, auto.offset.reset would kick in, since the documentation says
that
Hi Experts,
I am using Spark Structured Streaming to read messages from Kafka, with a
producer that works with an at-least-once guarantee. This streaming job is
running on a YARN cluster with Hadoop 2.7 and Spark 2.3.
What is the most reliable strategy for avoiding duplicate data within the
stream in the sc
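One commonly suggested approach (a sketch under assumptions, not a definitive answer: the broker, topic, and column names are illustrative, and it assumes each message carries a unique id plus an event-time column) is to deduplicate on that id within a watermark window:

```python
# PySpark sketch: requires a running Spark session with the Kafka source.
stream = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
          .option("subscribe", "events")                     # placeholder topic
          .load())

parsed = stream.selectExpr("CAST(key AS STRING) AS msg_id",  # assumed unique id
                           "timestamp")

# Duplicates arriving within the watermark are dropped; older state is evicted.
deduped = (parsed
           .withWatermark("timestamp", "10 minutes")
           .dropDuplicates(["msg_id", "timestamp"]))
```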
On Tue, Apr 30, 2019 at 6:48 PM Vatsal Patel
wrote:
> *Issue: *
>
> When I am reading a sequence file in Spark, I can specify the number of
> partitions as an argument to the API, as below:
> *public <K, V> JavaPairRDD<K, V> sequenceFile(String path, Class<K>
> keyClass, Class<V> valueClass, int minPartitions
Hi Rishi,
I've had success using the approach outlined here:
https://community.hortonworks.com/articles/58418/running-pyspark-with-conda-env.html
Does this work for you?
On Tue, Apr 30, 2019 at 12:32 AM Rishi Shah
wrote:
> modified the subject & would like to clarify that I am looking to creat
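For reference, the approach in that article boils down to shipping a packed conda environment with the job (hedged sketch: the flag names are standard spark-submit options, but environment.tar.gz and the paths are placeholders):

```shell
# Ship a packed conda env (e.g. built with conda-pack) to the YARN executors.
PYSPARK_PYTHON=./environment/bin/python \
spark-submit \
  --master yarn \
  --archives environment.tar.gz#environment \
  my_job.py
```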
*Issue: *
When I am reading a sequence file in Spark, I can specify the number of
partitions as an argument to the API, as below:
*public <K, V> JavaPairRDD<K, V> sequenceFile(String path, Class<K>
keyClass, Class<V> valueClass, int minPartitions)*
*In newAPIHadoopFile(), this support has been removed. below
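A workaround often suggested for this (hedged: a sketch, not a confirmed equivalent of minPartitions; the path and classes are placeholders) is to steer the partition count through the Hadoop split-size settings, which newAPIHadoopFile accepts via its conf argument:

```python
# PySpark sketch: smaller max split size => more input splits => more partitions.
rdd = sc.newAPIHadoopFile(
    "hdfs:///data/seq",  # placeholder path
    "org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat",
    "org.apache.hadoop.io.Text",
    "org.apache.hadoop.io.LongWritable",
    conf={"mapreduce.input.fileinputformat.split.maxsize": "33554432"},  # 32 MB
)
```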
Hi Austin,
Are you using Spark Streaming or Structured Streaming?
For better understanding, could you also provide sample code/config params
for your spark-kafka connector for the said streaming job?
Akshay Bhardwaj
+91-97111-33849
On Mon, Apr 29, 2019 at 10:34 PM Austin Weaver wrote:
> Hey
Hi,
The NA function will replace null with some default value, and not all my
columns are of type string, so for some other data types (long/int etc.) I
would have to provide some default value.
But ideally those values should stay null.
Actually, this null-column drop is happening in this step:
df.selectExpr( "
please read this to unsubscribe: https://spark.apache.org/community.html
TL;DR: user-unsubscr...@spark.apache.org so no mail to the list
On 4/30/19 6:38 AM, Amrit Jangid wrote:
-
To unsubscribe e-mail: user-unsubscr...@spark
Hi,
It seems koalas.DataFrame can't be displayed in a terminal yet, as described
in https://github.com/databricks/koalas/issues/150, and the workaround is
to convert it to a pandas DataFrame.
Thanks,
Manu Zhang
On Tue, Apr 30, 2019 at 2:46 PM Achilleus 003
wrote:
> Hello Everyone,
>
> I have been trying to r
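The workaround mentioned above, as a short sketch (assumes the koalas package is installed on top of a running Spark session; the data is purely illustrative):

```python
import databricks.koalas as ks

kdf = ks.DataFrame({"a": [1, 2, 3]})  # small illustrative frame
print(kdf.to_pandas())                # convert to pandas, then print normally
```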