[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

mateiz Sun, 16 Mar 2014 15:31:24 -0700

Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/146#issuecomment-37773571
  
    Also a few comments on the doc page (http://www.cs.berkeley.edu/%7Emarmbrus/sparkdocs/_site/sql-programming-guide.html):
    * Put Spark SQL closer to the top of the programming guide and API doc menus, say under Spark Streaming. I think it will be read more often than some of the other ones.
    * On the API docs, list Spark SQL Core first, then Hive Support, and then Catalyst Optimizer.
    * Each package should have a package.scala with a package-level doc for it (see e.g. http://www.cs.berkeley.edu/%7Emarmbrus/sparkdocs/_site/api/streaming/index.html#org.apache.spark.streaming.package).
    * The doc should explain the types of the various things returned (e.g. what is an ExecutedQuery, what does loadFile return).
    * Rename the Hive Metastore Support section to just Hive Support, since it supports the QL as well.
    * It would be cool to include an example of running MLlib or something similar on top of SQL data.



