Re: Guidelines for writing SPARK packages
Hi,

A package I maintain (https://github.com/maropu/hivemall-spark) extends existing Spark SQL/DataFrame classes for a third-party library. Please use this as a concrete example.

Thanks,
takeshi

--
---
Takeshi Yamamuro
Re: Guidelines for writing SPARK packages
Thanks David.

I am looking at extending the Spark SQL library with a custom package, so I was looking for more detail on which specific classes need to be extended or implemented to redirect calls to my module (when using .format).

If you have any info along these lines, do share with me... else debugging through the code would be the way :-)

Thanking You

Praveen Devarao
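For context on the .format question: in Spark 1.x the usual hook is the Data Source API in org.apache.spark.sql.sources. When you call .format("some.package"), Spark loads the class some.package.DefaultSource, which must implement RelationProvider (and optionally SchemaRelationProvider or CreatableRelationProvider). A minimal sketch follows; the package name com.example.mysource, the class TextRelation, and the use of the "path" option are illustrative assumptions, not part of any existing package:

```scala
// Minimal sketch of a Spark 1.x data source. Spark resolves
// .format("com.example.mysource") to com.example.mysource.DefaultSource.
// All names here (com.example.mysource, TextRelation) are hypothetical.
package com.example.mysource

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources.{BaseRelation, RelationProvider, TableScan}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// "DefaultSource" is the class name Spark looks for inside the
// package named in .format(...).
class DefaultSource extends RelationProvider {
  override def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String]): BaseRelation = {
    // DataFrameReader.load(path) arrives here as the "path" parameter
    val path = parameters.getOrElse("path", sys.error("'path' must be specified"))
    new TextRelation(path)(sqlContext)
  }
}

// A trivial relation that exposes each line of a text file as one row
class TextRelation(path: String)(@transient val sqlContext: SQLContext)
  extends BaseRelation with TableScan {

  // The schema this source presents to Spark SQL
  override def schema: StructType =
    StructType(StructField("value", StringType, nullable = true) :: Nil)

  // Produce the rows for a full table scan
  override def buildScan(): RDD[Row] =
    sqlContext.sparkContext.textFile(path).map(Row(_))
}
```

A caller would then reach this code with sqlContext.read.format("com.example.mysource").load("/some/path").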
Re: Guidelines for writing SPARK packages
Thanks for the reply David, just wanted to fix one part of your response:

> If you want to register a release for your package you will also need to
> push the artifacts for your package to Maven central.

It is NOT necessary to push to Maven Central in order to make a release. There are many packages out there that don't publish to Maven Central, e.g. scripts and pure Python packages.

Praveen, I would suggest taking a look at:
- the spark-package command line tool (https://github.com/databricks/spark-package-cmd-tool), to get you set up
- sbt-spark-package (https://github.com/databricks/sbt-spark-package), to help with building/publishing if you plan to use Scala in your package

You could of course use Maven as well, but we don't have a Maven plugin for Spark Packages.

Best,
Burak
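For reference, a build using the sbt-spark-package plugin mentioned above might look roughly like the following. The setting names (spName, sparkVersion, sparkComponents) follow that plugin's README; the plugin version, resolver URL, and the "myorg/my-package" name are assumptions to check against the README, not verified values:

```scala
// project/plugins.sbt -- pull in the sbt-spark-package plugin
// (resolver and version are approximate; confirm against the plugin README)
resolvers += "bintray-spark-packages" at "https://dl.bintray.com/spark-packages/maven/"
addSbtPlugin("org.spark-packages" % "sbt-spark-package" % "0.2.5")
```

```scala
// build.sbt -- a rough sketch; "myorg/my-package" is a placeholder
spName := "myorg/my-package"   // should match the GitHub owner/repo
sparkVersion := "1.6.0"        // Spark version to build against
sparkComponents += "sql"       // adds spark-sql as a provided dependency
version := "0.1.0"
scalaVersion := "2.10.5"
```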
Re: Guidelines for writing SPARK packages
Hi Praveen,

The basic requirements for releasing a Spark package on spark-packages.org are as follows:

1. The package content must be hosted by GitHub in a public repo under the owner's account.
2. The repo name must match the package name.
3. The master branch of the repo must contain "README.md" and "LICENSE".

Per the docs on the spark-packages.org site, an example package that meets those requirements can be found at https://github.com/databricks/spark-avro. My own recently released SAMBA package also meets these requirements: https://github.com/onetapbeyond/lambda-spark-executor.

As you can see, there is nothing in this list of requirements that demands the implementation of specific interfaces. What you'll need to implement will depend entirely on what you want to accomplish. If you want to register a release for your package you will also need to push the artifacts for your package to Maven central.

David

On Mon, Feb 1, 2016 at 7:03 AM, Praveen Devarao wrote:
> Hi,
>
> Are there any guidelines or specs for writing a Spark package? I would
> like to implement a Spark package and would like to know how it needs to
> be structured (implement some interfaces etc.) so that it can plug into
> Spark for extended functionality.
>
> Could anyone help me point to docs or links on the above?
>
> Thanking You
>
> Praveen Devarao

--
"All that is gold does not glitter, Not all those who wander are lost."
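To illustrate the consumer side of the spark-avro example cited above: that package is wired up through the same .format(...) mechanism discussed earlier in the thread. A short usage sketch for the Spark 1.x API, with the file path being a placeholder:

```scala
import org.apache.spark.sql.SQLContext

// assumes an existing SparkContext `sc` (e.g. from spark-shell)
val sqlContext = new SQLContext(sc)

// .format names the package's data source; Spark resolves it to
// com.databricks.spark.avro.DefaultSource behind the scenes
val df = sqlContext.read
  .format("com.databricks.spark.avro")
  .load("/path/to/data.avro")   // placeholder path

df.printSchema()
```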