Re: spark.sql.codegen.comments not in SQLConf?
It's probably because it is annoying to propagate that setting through SQLConf.

On Wed, May 10, 2017 at 3:38 AM Jacek Laskowski wrote:
> Hi,
>
> It seems that the spark.sql.codegen.comments property [1] didn't find its
> place in SQLConf [2], which appears to be the place for all Spark
> SQL-related properties (for codegen surely).
>
> Don't think it merits a JIRA issue so just asking here.
>
> If agreed, I'd like to propose a PR. Thanks.
>
> [1] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L822
> [2] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
>
> Regards,
> Jacek Laskowski
Re: How about the fetch the shuffle data in one same machine?
I don't think that counts as loopback; only localhost or 127.0.0.1 goes through loopback.

The benchmark code should also be simple and not involve any computation. Just write two tests: one only reads the file locally, the other only reads the file through Netty. Reading files of different sizes (small to big) should give different benchmark results. Of course a memory copy is faster than delivery over the network.

On Wed, May 10, 2017 at 6:14 PM, Saisai Shao wrote:
> There is a JIRA about this (https://issues.apache.org/jira/browse/SPARK-6521).
> Currently, Spark shuffle fetch still goes through Netty even when two
> executors are on the same node, but according to the test on the JIRA,
> performance is close whether or not the network is bypassed. From my
> understanding, the kernel will not transfer data to the NIC if it is just
> loopback communication (please correct me if I'm wrong).
>
> On Wed, May 10, 2017 at 5:53 PM, raintung li wrote:
>> Hi all,
>>
>> Currently Spark fetches a shuffle file locally only when the executorId is
>> the same. For the same IP but a different executorId it fetches over the
>> network, although the file could be fetched locally or over loopback.
>>
>> Apparently fetching the local file is faster, since it can use the LVS
>> cache.
>>
>> What do you think?
>>
>> Regards
>> -Raintung
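
The two-test benchmark suggested in the reply above could be sketched roughly as follows. This is a minimal Python sketch, not Spark or Netty code: a plain TCP socket bound to 127.0.0.1 stands in for the Netty transfer, and the function names (read_local, read_over_loopback, run_benchmark), the buffer size, and the file sizes are all assumptions made for illustration.

```python
import os
import socket
import tempfile
import threading
import time

CHUNK = 64 * 1024  # read buffer size in bytes (arbitrary choice)


def read_local(path):
    """Read the whole file from the local filesystem; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            total += len(chunk)
    return total


def read_over_loopback(path):
    """Serve the file once over a 127.0.0.1 TCP socket and read it back."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn, open(path, "rb") as f:
            conn.sendfile(f)  # closing conn afterwards signals end of stream

    sender = threading.Thread(target=serve)
    sender.start()

    total = 0
    with socket.create_connection(("127.0.0.1", port)) as client:
        while True:
            chunk = client.recv(CHUNK)
            if not chunk:
                break
            total += len(chunk)
    sender.join()
    server.close()
    return total


def timed(fn, path):
    """Return (bytes read, elapsed seconds) for one call of fn(path)."""
    start = time.perf_counter()
    n = fn(path)
    return n, time.perf_counter() - start


def run_benchmark(sizes=(4 * 1024, 1024 * 1024, 16 * 1024 * 1024)):
    """Time both read paths for each file size, small to big."""
    results = []
    for size in sizes:
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            tmp.write(os.urandom(size))
            path = tmp.name
        try:
            local_n, local_s = timed(read_local, path)
            net_n, net_s = timed(read_over_loopback, path)
            results.append((size, local_n, local_s, net_n, net_s))
        finally:
            os.remove(path)
    return results


if __name__ == "__main__":
    for size, _, local_s, _, net_s in run_benchmark():
        print(f"{size:>10} bytes  local {local_s * 1e3:8.3f} ms  "
              f"loopback {net_s * 1e3:8.3f} ms")
```

Note that the file has just been written, so both reads are most likely served from the OS page cache. A single run per size is also noisy; a real comparison would repeat each measurement and use Spark's own transfer path.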
spark.sql.codegen.comments not in SQLConf?
Hi,

It seems that the spark.sql.codegen.comments property [1] didn't find its place in SQLConf [2], which appears to be the place for all Spark SQL-related properties (for codegen surely).

Don't think it merits a JIRA issue so just asking here.

If agreed, I'd like to propose a PR. Thanks.

[1] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L822
[2] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

Regards,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
Re: How about the fetch the shuffle data in one same machine?
There is a JIRA about this (https://issues.apache.org/jira/browse/SPARK-6521). Currently, Spark shuffle fetch still goes through Netty even when two executors are on the same node, but according to the test on the JIRA, performance is close whether or not the network is bypassed. From my understanding, the kernel will not transfer data to the NIC if it is just loopback communication (please correct me if I'm wrong).

On Wed, May 10, 2017 at 5:53 PM, raintung li wrote:
> Hi all,
>
> Currently Spark fetches a shuffle file locally only when the executorId is
> the same. For the same IP but a different executorId it fetches over the
> network, although the file could be fetched locally or over loopback.
>
> Apparently fetching the local file is faster, since it can use the LVS
> cache.
>
> What do you think?
>
> Regards
> -Raintung
How about the fetch the shuffle data in one same machine?
Hi all,

Currently Spark fetches a shuffle file locally only when the executorId is the same. For the same IP but a different executorId it fetches over the network, although the file could be fetched locally or over loopback.

Apparently fetching the local file is faster, since it can use the LVS cache.

What do you think?

Regards
-Raintung
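
To make the question concrete, here is a toy Python sketch of the two rules being compared: the current one (local only when the executorId matches) and the proposed one (local whenever the host matches). It is not Spark's actual code; BlockLocation, split_current, and split_proposed are hypothetical names, and the executor IDs and IP addresses are made up.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for where a shuffle block lives.
BlockLocation = namedtuple("BlockLocation", ["executor_id", "host"])


def split_current(me, locations):
    """Rule described in the message: local only if the executorId matches."""
    local = [loc for loc in locations if loc.executor_id == me.executor_id]
    remote = [loc for loc in locations if loc.executor_id != me.executor_id]
    return local, remote


def split_proposed(me, locations):
    """Proposed rule: any block on the same host is read locally."""
    local = [loc for loc in locations if loc.host == me.host]
    remote = [loc for loc in locations if loc.host != me.host]
    return local, remote


# Made-up example: two executors share one host, a third is elsewhere.
me = BlockLocation("exec-1", "10.0.0.5")
locations = [
    BlockLocation("exec-1", "10.0.0.5"),
    BlockLocation("exec-2", "10.0.0.5"),
    BlockLocation("exec-3", "10.0.0.9"),
]

current_local, current_remote = split_current(me, locations)
proposed_local, proposed_remote = split_proposed(me, locations)
```

With this input, the block held by exec-2 is fetched over the network under the current rule and read locally under the proposed one. The sketch leaves out how one executor would locate and open another executor's files.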