Ah I see, good references. Perhaps it's really then a committer judgement
call on how many changes become "too many" for a single PR.
On Sun, May 15, 2016 at 11:16 PM, Hyukjin Kwon wrote:
> Thank you so much for the detailed explanation and the history.
>
>
> I understood and it seems *ProcedureDeclarationCh
Thank you so much for the detailed explanation and the history.
I understand now, and it seems *ProcedureDeclarationChecker* should not be
enabled.
However, *RedundantIfChecker* seems okay because there are only two
errors for this across the code base.
I have seen some rules have been added time t
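
For reference, a minimal sketch of the pattern that RedundantIfChecker flags, as I understand the Scalastyle rule: an if expression whose branches are just the boolean constants, so it can be replaced by the condition itself or its negation. The object and method names here are hypothetical and are not taken from the Spark code base.

```scala
object RedundantIfExample {
  // Flagged form: the if/else only restates the condition.
  def isEmptyVerbose(xs: Seq[Int]): Boolean =
    if (xs.isEmpty) true else false

  // Equivalent form: return the condition directly.
  def isEmptyDirect(xs: Seq[Int]): Boolean = xs.isEmpty

  // Negated variant: `if (cond) false else true`.
  def nonEmptyVerbose(xs: Seq[Int]): Boolean =
    if (xs.isEmpty) false else true

  // Equivalent form: negate the condition.
  def nonEmptyDirect(xs: Seq[Int]): Boolean = !xs.isEmpty
}
```

With only two occurrences across the code base, a fix along these lines would be a small, local change rather than a sweeping one.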
Relevant discussion from some time ago:
https://issues.apache.org/jira/browse/SPARK-3849?focusedCommentId=14168961&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14168961
In short, if enabling a new style rule requires sweeping changes throughout
the code base, then
Hi all,
Recently, I made a list of the rules that are currently not applied to Spark from
http://www.scalastyle.org/rules-dev.html and then I tried to test them.
I found two rules that I think might be helpful, but I am not too sure.
Could I ask whether both can be added?
*RedundantIfChecker* (See
http://www.scalastyl
I've been doing some looking at EclairJS (Spark + Javascript) which takes a
really interesting approach. The driver program is run in node and the
workers are run in nashorn. I was wondering if anyone has given much thought
to optionally exposing an interface for PySpark in a similar fashion. For
so
Hi
I am consistently observing a driver OutOfMemoryError (Java heap space)
during a shuffle operation, as indicated by the log:
16/05/14 21:57:03 INFO MapOutputTrackerMaster: Size of output statuses for
shuffle 2 is 36060250 bytes -> shuffle metadata size is big and the full
metadata will be sent
I don't know about the second one but for question #1:
When you convert from a cached DF to an RDD (via a map function or the
"rdd" value) the types are converted from the off-heap types to on-heap
types. If your rows are fairly large/complex this can have a pretty big
performance impact so I would