Hi Folks,
We have some Dataset/DataFrame use cases that would benefit from reusing the
SparkPlan and shuffle stage. For example, consider the following cases:
because the query optimization and the SparkPlan are generated by Catalyst
at execution time, the underlying RDD lineage is regenerated for
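The re-execution behaviour described above can be sketched in plain Python. This is an illustrative toy only, not Spark's actual classes or APIs (the names `LazyPlan`, `collect`, `persist` merely mimic the Dataset API): an uncached lazy plan re-runs on every action, while persisting materializes the result once so the lineage is not regenerated.

```python
# Toy sketch of lazy plan re-execution vs. caching.
# Hypothetical names; not Spark internals.

class LazyPlan:
    def __init__(self, compute):
        self._compute = compute   # stands in for the physical plan
        self._cached = None
        self.executions = 0       # how many times the plan actually ran

    def collect(self):
        """An 'action': re-runs the whole plan unless a cached result exists."""
        if self._cached is not None:
            return self._cached
        self.executions += 1
        return self._compute()

    def persist(self):
        """Materialize once and reuse, roughly like Dataset.persist()/cache()."""
        self.executions += 1
        self._cached = self._compute()
        return self

plan = LazyPlan(lambda: [x * 2 for x in range(5)])
plan.collect()
plan.collect()
print(plan.executions)  # 2: each action re-ran the plan

plan.persist()
plan.collect()
plan.collect()
print(plan.executions)  # 3: only the persist() run after caching
```

In real Spark the analogous fix is `df.persist()` (or `df.cache()`) before issuing multiple actions, which keeps the computed partitions instead of rebuilding the lineage each time.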
+1 if the docs can be exposed more.
On 19 Oct 2016 2:04 a.m., "Shivaram Venkataraman" <
shiva...@eecs.berkeley.edu> wrote:
> +1 - Given that our website is now on github
> (https://github.com/apache/spark-website), I think we can move most of
> our wiki into the main website. That way we'll only
On 17 Oct 2016, at 18:26, Ryan Blue <rb...@netflix.com> wrote:
Are these changes that the Hive community has rejected? I don't see a
compelling reason to have a long-term Spark fork of Hive.
More changes in Hive that haven't been picked up:
HIVE-11720 is needed to handle very long HTTP
+1 - Given that our website is now on github
(https://github.com/apache/spark-website), I think we can move most of
our wiki into the main website. That way we'll only have two sources
of documentation to maintain: A release specific one in the main repo
and the website which is more long lived.
Is there any way to tie wiki accounts with JIRA accounts? I found it weird that
they're not tied at the ASF.
Otherwise, moving this into the docs might make sense.
Matei
> On Oct 18, 2016, at 6:19 AM, Cody Koeninger wrote:
>
> +1 to putting docs in one clear place.
>
> On Oct 18, 2016 6:40 AM, "Sean Owen" wrote:
I think what Reynold means is that if it's easy for a developer to build
this convenience function using the current Spark API, it probably doesn't
need to go into Spark, unless it's being done to provide a similar API to a
system we are attempting to be semi-compatible with (e.g. if a
corresponding co
Sorry, by API do you mean use of 3rd-party libraries, user code, or
something else?
Thanks
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/On-convenience-methods-tp19460p19496.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
+1 to putting docs in one clear place.
On Oct 18, 2016 6:40 AM, "Sean Owen" wrote:
> I'm OK with that. The upside to the wiki is that it can be edited directly
> outside of a release cycle. However, in practice I find that the wiki is
> rarely changed. To me it also serves as a place for informa
I'm OK with that. The upside to the wiki is that it can be edited directly
outside of a release cycle. However, in practice I find that the wiki is
rarely changed. To me it also serves as a place for information that isn't
exactly project documentation like "powered by" listings.
In a way I'd like
Right now the wiki isn't particularly accessible for updates by external
contributors. We've already got a "Contributing to Spark" page which just
links to the wiki - how about if we just move the wiki contents over? This
way contributors can contribute to our documentation about how to
contribute pro
Hi Krishna,
Thanks for your interest contributing to PySpark! I don't personally use
either of those IDEs so I'll leave that part for someone else to answer -
but in general you can find the building spark documentation at
http://spark.apache.org/docs/latest/building-spark.html which includes
note
Hello,
I am a master's student. Could someone please let me know how to set up my
dev environment to contribute to PySpark.
Questions I had were:
a) Should I use IntelliJ IDEA or PyCharm?
b) How do I test my changes?
Regards,
Krishna
Hi,
I hope this is the right forum.
I am looking for some information on what to expect from
Structured Streaming in its next releases, to help me choose when/where to
start using it more seriously (or where to invest in workarounds and where
to wait). I couldn't find a good place where such planning